ENGINEERING ENZYMES THROUGH GENETIC SELECTION

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims benefit of and priority to US Provisional Patent
Application No. 60/520,754, filed on November 17, 2003, US Provisional Patent
Application No. 60/520,813, also filed on November 17, 2003, and US
Provisional Patent Application No. 60/ filed on October 18, 2004, each of
which, where permissible, is incorporated by reference in its entirety.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

may have certain rights in the disclosed subject matter.

1. Technical Field 

Aspects of the present disclosure are generally directed to systems and
methods for generating ligand-receptor pairs for transcriptional control by small
molecules.

2. Related Art 

Directed molecular evolution of enzymes is a developing field in the
biotechnology industry and occurs through the single or repeated application of
two steps: diversity/library generation followed by screening or selecting for
function. The last several years have produced much progress in each of these
areas. Techniques of diversity generation in the creation of libraries range from
methods with no structure/function prejudice (error-prone PCR; mutator strains)
to highly focused randomization based on structural information (site-directed
mutagenesis; cassette mutagenesis). DNA recombination (DNA shuffling, StEP,
SCRATCHY, RACHITT, RDA-PCR) requires no structural information but works
on the premise that Nature has already solved the problem of creating functional
proteins from amino acids. By randomly recombining the genes for related
proteins, new combinations of the different solutions are created which may be
better than any of the original individual proteins. Structure-based approaches
can be combined with other methods to generate greater diversity.

Advances have also been made in screening the generated libraries for
proteins with desired properties. In a screen, each protein in the library is
analyzed for function, which limits library size. In contrast, genetic selection
evaluates entire libraries at once, in a highly parallel fashion, because only
functional members of the library survive the selective pressure. In selection,
nonfunctional members of the library are not individually evaluated. For screens,
each variant must be individually assayed and the data evaluated, requiring
more time and materials. In vivo genetic selection strategies enable the
exhaustive analysis of protein libraries with up to about 10^° different members.
The quoted throughputs are maximal values for industrial, robot-driven
laboratories. Realistically, experience indicates that an academic,
individual-investigator laboratory can achieve up to 10^ samples/day for
screening in yeast and 10' samples/day for genetic selection in yeast. In
summary, genetic selection is generally preferable to screening not only because
it is higher throughput, but also because it requires less time and materials.

With regard to selection, there are several common conventional selection
strategies, such as i) antibiotic resistance, ii) substrate-selected growth, where
degradation of substrates provides elements essential for growth (such as C, N,
P, and S), iii) auxotrophic complementation to restore metabolic function, and iv)
phage display, which displays peptides or proteins on a virus surface and
segregates them on the basis of binding affinity. Although powerful, these
selection strategies are not general enough to apply to engineering enzymes for
many interesting reactions. Conventional systems rely on screening techniques
rather than selection techniques because selections are more difficult.

The generation of libraries has spawned many companies and, in fact, an
industry. What has so far failed to be addressed is a general method of
evaluating libraries (no matter how they are generated) through genetic
selection. Accordingly, there is a need for new compositions and methods
for engineering polypeptides and rapidly identifying engineered polypeptides
having desirable characteristics.

SUMMARY 

Methods and compositions for selecting or screening transformed cells
are provided. An exemplary method includes selecting transformed cells by
introducing a first polynucleotide into a transformed cell unable to survive on
selective media in the absence of a selection agent, wherein the transformed cell
expresses a recombinant receptor polypeptide that activates transcription of a
second polynucleotide in response to interaction of the recombinant receptor
polypeptide with a target substance; culturing the transformed cell on the
selective media in the absence of the selection agent; and selecting the
transformed cell that survives on the selective media in the absence of the
selection agent.

Another aspect provides a method for selecting transformed cells by
introducing a first polynucleotide into a transformed cell, wherein the transformed
cell expresses a recombinant receptor polypeptide that activates transcription of
a second polynucleotide in response to interaction of the recombinant receptor
polypeptide with a target substance, culturing the transformed cell on the
selective media in the presence of a first selection agent, and selecting the
transformed cell that survives on the selective media in the absence of the
selection agent, wherein the second polynucleotide encodes an enzyme that
converts the first selective agent into a product toxic to the transformed cell.

Still another embodiment provides a cell including a recombinant nuclear
receptor that induces transcription of a first polynucleotide in response to
interaction with a target substance, and an adapter fusion protein comprising a
human coactivator domain operably linked to an activation domain, wherein the
adapter fusion protein enhances transcription of the first polynucleotide induced
by the recombinant nuclear receptor.



BRIEF DESCRIPTION OF THE FIGURES 
Fig. 1 shows a schematic depicting an exemplary chemical
complementation scheme. For selection, yeast strain PJ69-4A has the ADE2
gene under the control of a Gal4 response element (Gal4RE). This strain is
transformed with a plasmid expressing ACTR:GAD (manuscript submitted).
Plasmids created through homologous recombination in PJ69-4A express a
variant GBD:RXR. In media lacking adenine, yeast will grow only in the presence
of a ligand that causes the RXR LBD to associate with ACTR and activate
transcription of ADE2. For clarity, only one ACTR:GAD is depicted.

Figs. 2a-o are line graphs showing selection assay (SC -Ade -Trp -Leu +
ligand) data for yeast growth in the presence of 9cRA (closed circles) and LG335
(open circles) for 43 hours.

Figs. 3a-o are line graphs showing screen assay (SC -Trp -Leu + ligand)
data for β-galactosidase activity with o-nitrophenyl β-D-galactopyranoside
(ONPG) substrate in the presence of 9cRA (closed circles) and LG335 (open
circles). Miller units normalize the change in absorbance at 405 nm for the
change in optical density at 630 nm, which reflects the number of cells per well.

Figs. 4a and b are line graphs showing data from mammalian cell culture
using a luciferase reporter with wtRXR (solid circle), I268A;I310S;F313A;L436F
(solid dot), I268V;A272V;I310M;F313S;L436M (inverted triangle),
I268A;I310M;F313A;L436T (gray square), I268V;A272V;I310L;F313M (upright
triangle), or I268A;I310A;F313A;L436F (gray circle) in response to (a) 9cRA and
(b) LG335. RLU = relative light units.

Figs. 5a-g are photographs of culture plates showing yeast transformed
with both ACTR:GAD and GBD:RXR grow in the presence of various
concentrations of 9cRA.

Figs. 6a-g are photographs of culture plates showing yeast transformed
with both SRC-1:GAD and GBD:RXR grow in the presence of various
concentrations of 9cRA.

Figs. 7a-f are photographs of culture plates showing negative selection of
yeast transformed with both ACTR:GAD and GBD:RXR in the presence of
various concentrations of 9cRA.

Figs. 8a-t are photographs of culture plates showing growth of the
indicated transformants of variant GBD:RXRs in response to various
concentrations of 9cRA.

Figs. 9a-e are schematics of exemplary embodiments for the selection of
desired transformants.

Fig. 10 is a schematic of an exemplary embodiment for the selection of 
selective receptor modulators in transformants incorporating a human nuclear 
receptor coactivator fused to a repression domain. 

Fig. 11 is a schematic of an exemplary embodiment for the selection of
receptor antagonists.

Fig. 12 is a schematic of an exemplary embodiment for chemical
complementation selection of transformants to obtain isotype or isoform selective
receptor agonists.

Fig. 13 is a schematic of an exemplary embodiment for chemical
complementation selection of transformants incorporating a nuclear receptor
coactivator fused to an activation domain for the selection of receptor agonists.

Fig. 14 is a Ligplot depiction of hydrophobic interactions between the
RXR LBD and 9cRA.

Figs. 15a-b show the structure of exemplary ligands used in chemical
complementation of one embodiment.

Figs. 16a-b show schematics of exemplary methods for the construction
of pGBDRXR:3stop (a) or an insert cassette library (b).

Figs. 17a-b are diagrams of exemplary constructs according to one
embodiment of the present disclosure.

DETAILED DESCRIPTION 

Methods and compositions for engineering proteins are provided, in
particular, methods for engineering proteins that interact with a target compound.
Embodiments of the disclosure combine chemical complementation with genetic
selection to engineer proteins, polypeptides, enzymes, antibodies, adhesins,
integrins, and the like. Typically, any protein or polypeptide that interacts with a
small molecule can be engineered or modified using the disclosed methods and
systems. Exemplary proteins include, but are not limited to, enzymes, antibodies,
cell surface receptors, polypeptides involved in signal transduction pathways,
intracellular polypeptides, secreted polypeptides, and transmembrane
polypeptides. In some embodiments, the polypeptides interact with a small
molecule that is produced naturally. Representative naturally produced small
molecules include, but are not limited to, neurotransmitters, cAMP, cGMP,
steroids, purines, pyrimidines, heterocyclic compounds, ATP, DAG, IP3, inositol,
calcium ions, magnesium ions, vitamins, minerals, and combinations thereof.
Some embodiments provide methods and systems for engineering proteins that
distinguish between optical isomers of a target compound.

Other embodiments provide a more efficient mammalian model system in
yeast for evaluating protein/ligand interactions, and can be utilized in an array of
applications including but not limited to drug discovery. Nuclear receptors are
implicated in diseases such as diabetes and various cancers. Agonists and
antagonists for these nuclear receptors serve as drugs. With chemical
complementation, libraries of compounds can be screened as potential agonists,
as described herein. In some embodiments, antagonists can be identified with
negative chemical complementation. Chemical complementation can also be
extended to identify isotype-selective agonists and antagonists and used for the
discovery of selective receptor modulators (e.g., SERMs).

In addition to drug discovery, the increase in sensitivity of disclosed
systems and methods also provides a method for engineering receptors to
recognize small molecules. For example, libraries of engineered receptors can
be transformed into yeast and plated onto media containing the target ligand.
These engineered receptors can be used for controlling transcription in
mammalian cells, and potentially applied towards gene therapy. Furthermore,
some embodiments of the disclosed system can give insight into the general
mechanism for understanding the fundamentals of protein structure and function.
In summary, we have demonstrated that the addition of an adapter protein
consisting of a human coactivator fused to a yeast transcriptional activator
increases the sensitivity of chemical complementation with RXR 1000-fold,
enhancing the system so that it is indistinguishable from activation by Gal4.
Negative chemical complementation was performed in a different yeast strain,
showing the versatility of the system, useful for performing chemical
complementation with various selectable markers. This system may be extended
to the ~75 human nuclear receptor proteins, plus nuclear receptors from other
organisms, and the coactivators and corepressors with which they interact.

Embodiments of the present disclosure comprise chemical
complementation systems focusing on one small molecule target ligand and
utilize the power of genetic selection to reveal proteins within the library that bind
and activate transcription in response to that small molecule. Functional
receptors from a large pool of non-functional variants can be isolated, even from
a non-optimized library.

Chemical complementation is a method which links survival of yeast to
the presence of a small molecule. This process allows high-throughput testing of
large libraries. Hundreds of thousands to billions of variants can be assayed in
one experiment without the spatial resolution necessary for traditional screening
methods (e.g., no need for one colony per well). Yeast can be spread on solid
media and, through the power of genetic selection, cells expressing active
variants will grow into colonies. Survivors can then be spatially resolved (e.g.,
transferred to a microplate, one colony per well) for further characterization,
decreasing the time and effort required to find new ligand-receptor pairs.

In one embodiment, among others, chemical complementation identifies
nuclear receptors with a variety of responses to a specific ligand. Nuclear
receptors that activate transcription in response to targeted molecules and not to
endogenous compounds have several additional potential applications. The
ability to switch a gene on and off in response to any desired compound can be
used to build complex metabolic pathways and gene networks, and to create
conditional knockouts and phenotypes in cell lines and animals. This ability can
also be useful in gene therapy and in agriculture to control expression of
therapeutic, pesticidal, or other genes. A variety of responses would be useful in
engineering biosensor arrays: an array of receptors with differing activation
profiles for a specific ligand could provide concentration measurements and
increased accuracy of detection.

The ability to engineer proteins that activate transcription in response to
any desired compound with a variety of activation profiles will provide a general
method of identifying enzymes. Receptors that bind the product of a desired
enzymatic reaction can be used to select or screen for enzymes that perform this
reaction. The enzymes may be natural or engineered. The stringency of the
assay can be adjusted by using ligand-receptor pairs with lower or higher EC50.
The lack of a general system for genetic selection is currently the limiting step for
directed evolution of enzymes.
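
The dependence of assay stringency on EC50 can be illustrated with a simple
dose-response sketch. The Hill-type response curve, the Hill coefficient of 1,
the growth threshold, and the example concentrations below are illustrative
assumptions, not values taken from this disclosure.

```python
def fractional_activation(ligand_conc, ec50, hill_coefficient=1.0):
    """Fraction of maximal activation under an assumed Hill-type model.

    ligand_conc and ec50 must be in the same units (e.g., molar).
    """
    if ligand_conc < 0 or ec50 <= 0:
        raise ValueError("ligand_conc must be >= 0 and ec50 must be > 0")
    if ligand_conc == 0:
        return 0.0
    ratio = (ligand_conc / ec50) ** hill_coefficient
    return ratio / (1.0 + ratio)


def survives(ligand_conc, ec50, threshold=0.5):
    """Hypothetical rule: a cell grows if activation reaches the threshold."""
    return fractional_activation(ligand_conc, ec50) >= threshold


# Same (hypothetical) product concentration, two hypothetical receptor pairs.
product_conc = 1e-6
for label, ec50 in (("low-EC50 pair", 1e-7), ("high-EC50 pair", 1e-5)):
    activation = fractional_activation(product_conc, ec50)
    print(label, round(activation, 3), survives(product_conc, ec50))
```

Under these assumptions, a pair with a high EC50 demands more product before a
cell survives, so only more active enzymes would pass the selection.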

The human retinoid X receptor (RXR) is a ligand-activated transcription
factor of the nuclear receptor superfamily. RXR plays an important role in
morphogenesis and differentiation and serves as a dimerization partner for other
nuclear receptors. Like most nuclear receptors, RXR has two structural domains:
the DNA binding domain (DBD) and the ligand binding domain (LBD), which are
connected by a flexible hinge region. The DBD contains two zinc modules, which
bind a sequence of six bases. The LBD binds and activates transcription in
response to multiple ligands including phytanic acid, docosahexaenoic acid, and
9-cis retinoic acid (9cRA). RXR is a modular protein; the DBD and LBD can
function independently. Therefore, the LBD can be fused to other DBDs and
retain function. A conformational change is induced in the LBD upon ligand
binding, which initiates recruitment of coactivators and the basal transcription
machinery, resulting in transcription of the target gene.

Nuclear receptors have evolved to bind, and activate transcription in
response to, a variety of small molecule ligands. The known ligands for nuclear
receptors are chemically diverse, including steroid and thyroid hormones, vitamin
D, prostaglandins, fatty acids, leukotrienes, retinoids, antibiotics, and other
xenobiotics. Evolutionarily closely related receptors (e.g., thyroid hormone
receptor and retinoic acid receptor) bind different ligands, whereas some
members of distant subfamilies (e.g., RXR and retinoic acid receptor) bind the
same ligand. This diversity of ligand-receptor interactions demonstrates the
versatility of the fold for ligand binding and suggests that it should be possible to
engineer LBDs with a large range of novel specificities.

The crystal structure of RXR bound to 9cRA elucidates important
hydrophobic and polar interactions in the LBD binding pocket. In one
embodiment, a subset of 20 hydrophobic and polar amino acids within 4.4 Å of
the bound 9cRA are varied to make a library. These residues in RXR are good
candidates for creating variants that bind different ligands through site-directed
mutagenesis, because side chain atoms, not main chain atoms, contribute the
majority of the ligand contacts. A library of RXR LBDs with all 20 amino acids at
each of the 20 positions in the ligand-binding pocket screened against multiple
compounds could potentially produce many new ligand-receptor pairs. However,
the number of possible combinations (20^20, approximately 10^26) renders
saturation mutagenesis impractical for constructing a complete library.
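
The scale of a complete library follows directly from the two counts stated
above (20 amino acids at each of 20 positions). A quick check is sketched
below; the per-experiment capacity of 10^9 variants is a hypothetical
parameter chosen from the "billions of variants" figure mentioned for genetic
selection, not a measured value.

```python
import math

AMINO_ACIDS_PER_POSITION = 20
RANDOMIZED_POSITIONS = 20

# Every position varies independently, so the counts multiply.
library_size = AMINO_ACIDS_PER_POSITION ** RANDOMIZED_POSITIONS
order_of_magnitude = math.floor(math.log10(library_size))


def experiments_needed(total_variants, variants_per_experiment):
    """Ceiling division using exact integer arithmetic."""
    return -(-total_variants // variants_per_experiment)


print(f"complete library: about 10^{order_of_magnitude} variants")
print("experiments at 10^9 variants each:", experiments_needed(library_size, 10**9))
```

Even at a billion variants per experiment, exhaustive coverage would take on
the order of 10^17 experiments, which is why a sampled or restricted library
is used instead.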

Codon randomization creates protein libraries with mutations at specific
sites. In one embodiment, a modified version of the Sauer codon randomization
method to create a library of binding pocket variants of RXR is provided. This
library allowed exploration of a vast quantity of sequence space in a minimal
amount of time.

Chemical complementation allows testing for the activation of protein
variants by specific ligands using genetic selection. In one embodiment, LG335,
a synthetic retinoid-like compound, was used as a model for discovery of
ligand-receptor pairs from large libraries using chemical complementation.
LG335 was previously shown to selectively activate an RXR variant and not
activate wild-type RXR. Combining chemical complementation with a large
library of protein variants decreases the time, effort, and resources necessary
to find new ligand-receptor pairs.

Enzyme Engineering

One embodiment provides methods and compositions for engineering a
polypeptide, for example an enzyme, to produce or interact with a desired
molecule. Generally, a desired molecule of interest (or the reaction product) is
chosen, and a target nuclear receptor is also chosen. After the target molecule
and the target nuclear receptor are selected, modifications to the target nuclear
receptor can be designed. For example, the X-ray structure of the target nuclear
receptor can be loaded into a modeling program, including, but not limited to,
Insight® or Flexx®, along with the structure of the desired target molecule.
Specific in silico interactions of the target receptor with the target molecule/ligand
can be analyzed and those amino acids that may contribute to ligand binding
can be noted for modification. Generally, a nuclear receptor is selected that has
at least a detectable amount of interaction with the target molecule or ligand or a
binding pocket of a similar size and shape. The interaction can then be
modulated as desired by creating a library of modified receptors.

To create the library, site-specific codon randomization can be used. It will
be appreciated that any process for generating a library of modified receptors
can be used. Site-specific codon randomization involves modifying the amino
acids identified through modeling as having or believed to have direct or indirect
interactions with the ligand. When producing or designing the oligonucleotide, in
place of those amino acids, there will be a degenerate code based on the
combination of nucleotides that are desired. For example, suppose the
modification is a change from alanine to a cysteine, leucine, phenylalanine,
isoleucine, threonine, serine, valine, or methionine. The nucleotide sequence
for the alanine is GCC, and to possibly incorporate all of the desired amino acids
mentioned above, the following changes in each position must be made:
The oligonucleotide can be designed to have either a T, A, or G in the first
position, a T or C in the second position, and a G or C in the third position. For
example, if a TTG (one of the combinations above) is in place of the GCC, that
would incorporate a leucine instead of the alanine. Therefore, the
oligonucleotides are ordered such that each has the possibility of a T, A, or
G in the first position, a T or C in the second position, and a G or C in the third
position. The oligonucleotides may be designed to include insertions or
deletions. The oligonucleotides have ends that are homologous to the vector
into which the gene will be introduced.
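
The degenerate design described above can be enumerated to see which codons
and amino acids it actually reaches. The sketch below assumes the standard
genetic code; the function name and layout are illustrative only and are not
part of the disclosed method.

```python
from itertools import product

# Standard genetic code for DNA codons, with bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[index]
    for index, (first, second, third) in enumerate(product(BASES, repeat=3))
}


def encoded_amino_acids(first_bases, second_bases, third_bases):
    """Map every codon allowed by a degenerate design to its amino acid."""
    return {
        "".join(codon): CODON_TABLE["".join(codon)]
        for codon in product(first_bases, second_bases, third_bases)
    }


# Design from the text: T/A/G at position 1, T/C at position 2, G/C at position 3.
pool = encoded_amino_acids("TAG", "TC", "GC")
for codon, amino_acid in sorted(pool.items()):
    print(codon, amino_acid)
print("distinct amino acids:", "".join(sorted(set(pool.values()))))
```

Under the standard code, this pool holds 12 codons covering alanine,
isoleucine, leucine, methionine, phenylalanine, serine, threonine, and valine,
with no stop codons. Cysteine codons (TGT, TGC) carry a G in the second
position, so reaching cysteine would appear to require also allowing G there.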

In one embodiment, to create a receptor library, the vector into which the
gene will be incorporated will be cut with restriction enzymes, deleting a
fragment of the wild-type gene. Oligonucleotides will be designed with
homologous ends to the vector as mentioned above, but these oligonucleotides
will also be designed such that they overlap each other. The overlapping ends
will hybridize to each other, and using, for example, the enzyme Klenow, the ends
are filled in. Then, using the polymerase chain reaction (PCR), the full gene or a
fragment thereof will be amplified. After both of these products are made, these
genes will be introduced into chemical complementation. The vector and gene
will be introduced into yeast using transformation protocols, for example
protocols introduced by Gietz and co-workers. During transformation, the vector
and gene or gene fragment will homologously recombine, and the various
receptor mutants will be expressed.

To select for variants that bind the desired small molecule, chemical
complementation is used. Chemical complementation is a general method of
linking any small molecule to genetic selection. Chemical complementation is a
new derivative of the yeast two-hybrid system, a three-component system that in
one embodiment comprises a human nuclear receptor protein, its coactivator
protein, and a small molecule ligand, where the nuclear receptor and coactivator
associate and activate transcription only in the presence of the ligand. An
exemplary yeast strain contains a Gal4 response element fused to the ADE2
gene. If adenine is not provided in the medium, the yeast will not be able to
survive unless they are able to make their own, and to do that, expression of
ADE2 needs to be activated. The following exemplary plasmids can be utilized:
one plasmid encodes a fusion protein of the Gal4 DNA binding domain (Gal4
DBD) fused to the variant receptor ligand-binding domain (LBD); the other fusion
protein comprises a human coactivator protein fused to the Gal4 activation
domain. In the presence of ligand, the ligand will bind to the variant receptor
ligand-binding domain and the Gal4 DNA binding domain will bind to the Gal4
response element. This will cause the protein to undergo a conformational
change, and will recruit the coactivator fused to the Gal4 activation domain. This,
in turn, will result in RNA polymerase being recruited and activation of
transcription of the downstream gene.

Figure 1. Creating a library of receptors to bind the desired small molecule.
On the left is the scheme for creating the vector cassette and the variant
receptors. Once these genes are made, they are introduced into yeast and put
through chemical complementation shown to the right. If the variant receptor is
able to bind and activate in response to the ligand, the yeast will be able to
grow on media lacking adenine because the ADE2 will be turned on. Colonies
that are able to grow on plates containing the small molecule and no adenine
are "hits" and will then be

The transformed yeast from above will be plated onto plates containing
the desired small molecule. Through chemical complementation, the variant
receptor that is able to bind the desired molecule will activate the ADE2 gene,
allowing that yeast colony to grow. The plasmid from that colony will be rescued
and sequenced, and an engineered receptor will be identified and will be carried
on to the next step. It will be appreciated that there may be many variant
receptors that allow the yeast to grow without binding the targeted ligand. For
example, they may be constitutively active or bind an endogenous small
molecule. These receptors may be identified through screening without the
targeted ligand. Alternatively, they may be removed from the library by negative
genetic selection on media without the targeted ligand, either before or after
chemical complementation. Once an engineered receptor has been created, this
gene can be integrated into the yeast genome, for example via homologous
recombination. This will create a new strain that will be used in the following
process.
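
The logic for separating true hits from variants that grow without the
targeted ligand can be sketched as a small truth-table model. The class,
field names, and labels below are hypothetical and only encode the reasoning
in the preceding paragraph; they do not model growth rates or partial
activation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReceptorVariant:
    name: str
    binds_target: bool              # activates in response to the targeted ligand
    constitutive: bool = False      # active with no ligand at all
    binds_endogenous: bool = False  # activated by a molecule the yeast already makes


def grows_without_adenine(variant, target_ligand_present):
    """Growth on media lacking adenine requires the receptor to switch ADE2 on."""
    active_without_target = variant.constitutive or variant.binds_endogenous
    return active_without_target or (target_ligand_present and variant.binds_target)


def classify(variant):
    with_ligand = grows_without_adenine(variant, target_ligand_present=True)
    without_ligand = grows_without_adenine(variant, target_ligand_present=False)
    if without_ligand:
        return "false positive"  # removable by screening or negative selection
    if with_ligand:
        return "hit"
    return "non-functional"


library = [
    ReceptorVariant("v1", binds_target=True),
    ReceptorVariant("v2", binds_target=False, constitutive=True),
    ReceptorVariant("v3", binds_target=False),
]
for variant in library:
    print(variant.name, classify(variant))
```

Only variants that grow with the targeted ligand and fail to grow without it
are carried forward in this sketch.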

Once the receptor that can bind the small molecule has been identified,
individual enzymes or a library of enzymes can be evaluated to generate the
product of interest. Libraries of naturally occurring enzymes, for example
expression cDNA libraries, may be evaluated. Also, libraries of enzymes can be
created using a number of mutagenic protocols, such as DNA shuffling,
RACHITT, and error-prone PCR, to name a few. For example, an enzyme that is
suspected of interacting with the target molecule can be selected and
mutagenized with conventional techniques. Alternatively, yeast or
microorganisms can be randomly mutated.

Figure 2. Cells grow on media lacking adenine with precursors A and B.

In one embodiment, chemical complementation is used to identify the
engineered enzyme. In this embodiment, the library of engineered enzymes will
be introduced into the yeast strain transformed with the modified nuclear
receptor described above. This yeast strain has a variant receptor integrated
into its genome, and the variant receptor is able to bind the product molecule.
Once the engineered enzymes have been transformed into the yeast strain, the
yeast will be spread onto selective plates (for example, plates lacking adenine)
containing the reactants involved in the enzymatic reaction that can be used to
synthesize the missing product. The yeast will take up the reactants, and if the
yeast express an engineered enzyme that can convert the reactants to the
reaction product, then the yeast will survive. The yeast will survive because the
reaction product will be able to bind to the variant receptor and activate
transcription of the ADE2 gene or other selection gene. The DNA from the yeast
colony that grew will be rescued and sequenced.
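
The coupling between enzyme activity and survival described above can be
sketched as a simple filter over an enzyme library. The enzyme names,
reactants "A" and "B", and products "P" and "Q" are hypothetical placeholders,
and the model ignores uptake, yield, and receptor sensitivity.

```python
def select_enzymes(enzyme_library, reactants_supplied, receptor_ligand):
    """Return names of enzymes whose host cells would survive the selection.

    enzyme_library maps an enzyme name to (required_reactants, product).
    A cell survives only if its enzyme can use the supplied reactants and the
    product is the molecule recognized by the integrated variant receptor.
    """
    survivors = []
    for name, (required_reactants, reaction_product) in enzyme_library.items():
        has_reactants = set(required_reactants) <= set(reactants_supplied)
        if has_reactants and reaction_product == receptor_ligand:
            survivors.append(name)
    return survivors


library = {
    "enzyme_1": (("A", "B"), "P"),  # makes the receptor's ligand from A and B
    "enzyme_2": (("A", "B"), "Q"),  # makes a product the receptor ignores
    "enzyme_3": (("A", "C"), "P"),  # needs a reactant that is not on the plate
}
print(select_enzymes(library, reactants_supplied={"A", "B"}, receptor_ligand="P"))
```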

Target compounds that serve as ligands can be selected from any variety
of natural or synthetic compounds. In one embodiment, natural products with
agricultural or medicinal applications can be selected as target compounds. The
search for natural products as potential agrochemical agents has increased due
to the demand for crop protection chemicals. In 1990, the world market value of
pesticides totaled nearly $23 billion. Synthetic chemical pesticides are used to
protect crops, but several developments have triggered the search for alternative
compounds. First, resistance has developed against synthetic chemical
pesticides. Second, concern has arisen regarding potential human health risks.
Third, there is a growing awareness of environmental damage, such as
contamination of soil, water, and air. New environmentally friendly methods are
being pursued to rectify these problems. In one embodiment of the present
disclosure, the disclosed methods can be used to identify new prototype
pesticides in natural products produced by microorganisms, for example, which
are perceived as more environmentally friendly and acceptable. The natural
products would be applied as the synthetic chemical pesticides have been, or the
biosynthetic genes would be expressed in transgenic plants. This strategy has
been widely applied using the Bacillus thuringiensis toxin. In another
embodiment, genes for toxins are delivered to target pest species using
insect-specific viruses that leave beneficial insects unharmed. These "greener"
technologies require not only identification of active natural products but also
the genes for their biosynthesis. With these applications in mind, and because of
their availability, three compounds have been chosen as target ligands.
Barbamide and jaspamide are relevant to the agricultural industry. Resveratrol
has antiviral, antimicrobial, and anticancer effects.

Figure 1. Compounds targeted as ligands.

Barbamide is a natural product from the marine cyanobacterium Lyngbya
majuscula. From 295 g of algae, 258 mg of pure barbamide can be isolated.
This chlorinated lipopeptide has potent molluscicidal activity. The gene cluster
for barbamide biosynthesis from L. majuscula has been cloned and analyzed.
An approximately 26 kb region of DNA from this organism specifies the
biosynthesis of barbamide. The gene cluster revealed 12 open reading frames,
and it is believed that barbamide is synthesized from acetate, L-phenylalanine,
L-cysteine, and L-leucine. Polyketide synthase and non-ribosomal peptide
synthetase modules accomplish biosynthesis. A trichloroleucine intermediate is
involved, but an unresolved issue is its transfer between modules. The total
synthesis of barbamide has been reported.

Jaspamide was isolated from various marine sponges and exhibits
insecticidal (against Heliothis virescens) and fungicidal activity (against Candida
albicans). It is completely inactive against a series of Gram-negative and
Gram-positive bacteria. From 700 g of sponge tissue, 80 mg of pure jaspamide
was isolated. The biosynthetic pathway has not been elucidated, but its structure
suggests polyketide synthase and non-ribosomal peptide synthetase modules.
Since it is a fungicide, a bacterial chemical complementation system for
engineering nuclear receptors and discovering the genes involved in the
biosynthesis of this compound would be used.

Resveratrol is a stilbene phytoalexin that is produced in at least 72 plant
species. Phytoalexins are low molecular weight antimicrobial metabolites that
are produced by plants for protection against a wide range of pathogens. Some
nuclear receptors are known to bind resveratrol, making the DNA shuffling
approach to engineer a receptor highly relevant. This compound is commercially
available on the gram scale.

Figure 1. Scheme for using nuclear receptors with genetic
selection strategy for the directed evolution of amine
dehydrogenases (AmDH). The nuclear receptor is a dimer
bound to DNA at the Gal4 response element (GalRE) through
the Gal4 DNA binding domain (DBD), regulating
transcription of an essential gene (either HIS3 or ADE2).
First, a nuclear receptor ligand-binding domain (LBD) is
engineered to activate transcription in response to the desired
(R)-amine. Second, libraries of AmDH are transformed into
the microbe and grown on media supplemented with the
appropriate ketone. Only microbes with a functional AmDH
that converts the ketone into the (R)-amine survive.

enzymes such as formate dehydrogenase (FDH).

The starting enzyme is typically examined for levels of activity, albeit
small, against a substrate, for example the ketone substrate in a high-ammonia
environment, either i) in water/liquid ammonia mixtures, or ii) in saturating
concentrations of ammonium formate or ammonium carbonate. A sensitive
assay can be employed to check for NADH consumption, such as formation of
formazan (λmax = 450 nm). In this embodiment, starting from an (S)-amino acid
dehydrogenase, either PheDH from Rhodococcus rhodochrous or LeuDH from
Bacillus stearothermophilus, an (R)-AmDH can be developed through change of
substrate specificity. Diversity is generated within the respective gene through
both random mutagenesis and recombination. Selection via binding of the
product to a nuclear receptor with subsequent transcriptional control is chosen
as the strategy to assay for successful variants.

Nuclear receptors PXR, BXR, and RAR can be used for engineering (R)-amine
activated transcription with the disclosed methods and compositions. For
example, these nuclear receptors can be engineered to activate the transcription
of the essential metabolic gene ADE2 in response to the (R)-amines in the
modified Saccharomyces cerevisiae strain PJ69. PXR is chosen because of its
broad substrate specificity. BXR is chosen because it is already known to
activate transcription in response to amines. Random and structure-based
approaches of creating libraries to engineer the nuclear receptors for (R)-amine
activated growth through genetic selection can be used. Receptors for multiple
(R)-amines will be engineered in parallel by selecting each library on multiple
selective plates with the appropriate (R)-amine. Optionally, negative selection
can be used to select libraries against enzymes that make an S-enantiomer
product, followed by selection for production of the R-enantiomer (or vice
versa). A nuclear receptor library for the (R)-amine ligand can be synthesized.

Additionally, the (R)-amine ligand can be synthesized in vivo by an expressed
AmDH from the ketone precursor supplemented within the growth medium. A
mutant PheDH library can then be screened for in vivo synthesis of (R)-amines.
In this overall scheme, the power of genetic selection is used to detect
biocatalytic synthesis of amines. Utilizing genetic selection means that each
member of the library does not need to be screened; only functional AmDH
variants appear because they allow the microbe to grow and form a colony.
Furthermore, catalysis is directly selected, as opposed to some related but
indirect property (like transition state binding). Genetic selection coupled
with the broad ligand specificity of nuclear receptors creates a process to
rapidly improve biocatalysts for more efficient synthesis of enantiomerically
pure compounds.

Selected transformants can be optimized through successive rounds of
directed evolution. Further mutant libraries of PheDH/LeuDH enzymes can be
screened for in vivo synthesis of (R)-amine. Mutant AmDH enzymes can be
expressed and further studied for shifts in substrate specificity and changes in
kinetic reaction rates.

Fig. 10 depicts another embodiment for the identification of selective
receptor modulators (analogous to selective estrogen receptor modulators). In
this embodiment, the human nuclear receptor coactivator ACTR is fused to the Gal4
activation domain (ACTR:GAD). Additionally, the human nuclear receptor
coactivator SRC1 is fused to a yeast repression domain (SRC1:RD). In the
presence of an agonist, these coactivator fusion proteins compete for expression
of the HIS3 gene. The HIS3 gene encodes imidazoleglycerolphosphate
dehydratase. In the presence of an agonist that recruits both coactivators
equally, the yeast probably will produce enough histidine to survive. Adding the
inhibitor 3-AT to the plates raises the threshold of enzyme that must be produced
to permit growth. Compounds that selectively favor the RXR-ACTR interaction
over the RXR-SRC-1 interaction will allow yeast to grow.

Fig. 11 is a diagram of another embodiment incorporating negative
chemical selection. The human nuclear receptor coactivator ACTR is fused to the
Gal4 activation domain (ACTR:GAD). The Gal4 DBD is fused to the nuclear
receptor LBD (GBD:RXR). The Gal4 DBD binds to the Gal4 response element,
regulating transcription of the URA3 gene. The URA3 gene codes for
orotidine-5'-phosphate decarboxylase, an enzyme in the uracil biosynthetic
pathway. This gene can be used for both positive and negative selection. For
positive selection, yeast expressing this gene will survive in the absence of
uracil in the media. For negative selection, 5-fluoroorotic acid (FOA) is added
to the media. Expression of orotidine-5'-phosphate decarboxylase converts FOA to
the toxin 5-fluorouracil, which kills the yeast. Libraries of small molecules
can be screened in a high-throughput assay in wells containing an agonist and
FOA. Antagonists will allow yeast to grow.

Fig. 12 is a diagram illustrating still another embodiment in which
isotype-specific nuclear receptor agonists are identified. Each isotype can be
fused to a different DBD controlling expression of different genes. The isotype
for which an agonist is sought is fused to the Gal4 DBD to control expression of
ADE2 (for positive chemical complementation). The isotype against which
selectivity is desired is fused to the GCN4 DBD to control expression of the
URA3 gene (for negative chemical complementation). Libraries of small molecules
are screened in individual wells of a 384-well plate. Compounds that do not
activate the receptor will not allow the yeast to grow. Compounds that agonize
both isotypes will kill the yeast. Only compounds that agonize RXRα, and either
do not bind or antagonize RXRβ, will allow yeast to grow.
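The growth outcomes described for Fig. 12 amount to a small truth table. The sketch below is only an illustration of that logic, assuming the wells lack adenine and contain FOA; the function and argument names are hypothetical and do not come from the disclosure.

```python
def yeast_grows(agonizes_target: bool, agonizes_off_target: bool) -> bool:
    """Predict growth under combined positive and negative selection.

    agonizes_target: compound activates the isotype fused to the Gal4 DBD,
        which drives ADE2 (positive chemical complementation).
    agonizes_off_target: compound activates the isotype fused to the GCN4 DBD,
        which drives URA3 (negative chemical complementation).
    Assumes medium without adenine and with FOA (an assumption of this sketch).
    """
    ade2_on = agonizes_target       # adenine can be made only if ADE2 is on
    ura3_on = agonizes_off_target   # URA3 expression makes FOA toxic
    return ade2_on and not ura3_on


# Enumerate all four cases of the truth table.
for target in (False, True):
    for off_target in (False, True):
        outcome = "grows" if yeast_grows(target, off_target) else "no growth"
        print(f"target agonist={target}, off-target agonist={off_target}: {outcome}")
```

Only the case with a target agonist that does not activate the second isotype yields growth, which matches the selectivity criterion stated in the text.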

Fig. 13 shows another embodiment in which a human nuclear receptor
coactivator, ACTR, is fused to the Gal4 activation domain (ACTR:GAD). The
Gal4 DBD is fused to the nuclear receptor LBD (GBD:RXR). The Gal4 DBD
binds to the Gal4 response element, regulating transcription of the ADE2 gene.
Upon binding of the ligand, the LBD of the nuclear receptor undergoes a
conformational change, which recruits the ACTR:GAD fusion protein. This
brings the Gal4 AD and Gal4 DBD into close proximity, activating transcription of
the ADE2 gene. For clarity, only one ACTR:GAD protein is shown binding one
GBD:RXR. Libraries of small molecules are screened in individual wells of a
384-well plate. Agonists will allow yeast to grow.

Materials and Methods 

Ligands. 9-cis retinoic acid (MW = 300.44 g/mol) was purchased from ICN
Biomedicals.

LG335 Synthesis

3-(1-Carbonyl)propyl-5,5,8,8-tetramethyl-5,6,7,8-tetrahydronaphthalene

2,5-Dimethyl-2,5-hexanediol (5.0 g, 34 mmol) was dissolved in anhydrous
benzene (150 mL). AlCl3 (5.0 g, 38 mmol) was added slowly while the mixture
was stirred in an ice bath, followed by stirring at room temperature for 1 hour.
Another portion of AlCl3 (5.0 g, 38 mmol) was then added and the reaction was
heated to 50 °C and stirred overnight. The brown solution was poured over iced
0.4 M HCl (50 mL) and extracted with ether (3 x 50 mL). The organic layer was
then sequentially washed with water, saturated aqueous NaHCO3, and brine (80
mL each) and dried (MgSO4). The solvent was removed in vacuo to afford 6.2 g
of a yellow liquid (2).

The crude product was then mixed with propionyl chloride (3.2 mL, 37
mmol) and the resulting solution added dropwise to a mixture of AlCl3 (5.0 g, 38
mmol) in dichloroethane (20 mL) while maintaining the temperature between 20
and 25 °C. The mixture was stirred for 2 hours at room temperature, at which
point it was quenched by pouring carefully over ice. The reaction mixture was
then extracted with methylene chloride (3 x 10 mL). The organic layers were then
combined and washed with water and saturated aqueous NaHCO3, and the volatiles
were removed by rotary evaporation. The product was purified by silica gel column
chromatography eluting with hexanes:chloroform (4:1, then 1:1) to yield 6.9 g (28
mmol, 73%) of product as a yellow oil (3, 4).

3-Propyl-5,5,8,8-tetramethyl-5,6,7,8-tetrahydronaphthalene

3-(1-Carbonyl)propyl-5,5,8,8-tetramethyl-5,6,7,8-tetrahydronaphthalene (1.0 g, 4.1
mmol) in MeOH (10 mL), H2O (1 mL), and conc. HCl (3 drops) was treated with
10% Pd/C (144 mg) and subjected to catalytic hydrogenation conditions at 60 psi
while heating gently overnight. When the reaction was considered complete (Rf
= 0.76, 5% EtOAc in hexanes) it was filtered through a celite pad and rinsed with
MeOH (10 mL) and hexane (50 mL). Water (1 mL) was then added to the filtrate
and the organic phase separated and washed with brine (2 x 20 mL). The
aqueous layer was washed with hexanes (2 x 20 mL). The organic layers were
dried (Na2SO4), filtered and the volatiles removed by rotary evaporation to
produce 510 mg (2.2 mmol, 54%) of a colorless oil (5).

4-[(3-Propyl-5,5,8,8-tetramethyl-5,6,7,8-tetrahydro-2-naphthyl)carbonyl]benzoic Acid (LG335)

3-Propyl-5,5,8,8-tetramethyl-5,6,7,8-tetrahydronaphthalene (2.2 g, 9.5 mmol) and
chloromethyl terephthalate (2.0 g, 10 mmol) were dissolved in dichloroethane (20
mL) and FeCl3 (80 mg, 490 µmol) was added. The reaction mixture was stirred
at 75 °C for 24 hours. The reaction was then cooled and MeOH (20 mL) added.
The resulting slurry was stirred for 7 hours at room temperature, filtered and
rinsed with cold MeOH (20 mL) to result in 2.1 g (5.5 mmol, 58%) of white
crystals (6).

The crystals (107 mg, 280 µmol) were stirred in MeOH (2 mL), to which
5N KOH (0.5 mL) was added. This mixture was refluxed for 30 minutes, cooled
to room temperature and acidified with 20% aqueous HCl (0.5 mL). The MeOH
was evaporated and the residue was extracted with EtOAc (2 x 5 mL). The
organic layers were combined, dried (MgSO4) and filtered. The filtrate was
treated with hexane (10 mL) and reduced in volume to 2 mL. After standing
overnight the resulting crystals were collected to provide 39 mg (103 µmol, 37%)
as a white powder (1). mp 250-252 °C; 1H NMR (CDCl3) δ 0.88 (t, 3H,
-CH2CH2CH3), 1.20 (s, 6H, CH3), 1.32 (s, 6H, CH3), 1.55 (dt, 2H, -CH2CH2CH3),
1.69 (s, 4H, CH2), 2.65 (t, 2H, -CH2CH2CH3), 7.20 (s, 1H, Ar-CH), 7.23 (s, 1H,
Ar-CH), 7.89 (d, 2H, Ar-CH), 8.18 (d, 2H, Ar-CH); MS (EI POS) m/z mass for
C25H30O3: Calc. 378.2189, Found 378.2195; Anal. for C25H30O3: Calc. C: 79.33,
H: 7.99. Found C: 79.10, H: 7.96.
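The calculated elemental analysis quoted for C25H30O3 can be checked with a few lines of arithmetic. The sketch below is illustrative only; it assumes standard average atomic weights (C 12.011, H 1.008, O 15.999), and the function name is hypothetical.

```python
# Standard average atomic weights (assumed values for this sketch).
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "O": 15.999}


def mass_percent(formula):
    """Return the mass percentage of each element in a molecular formula.

    formula maps an element symbol to its atom count, e.g. {"C": 25, ...}.
    """
    total = sum(ATOMIC_WEIGHT[el] * n for el, n in formula.items())
    return {el: 100.0 * ATOMIC_WEIGHT[el] * n / total for el, n in formula.items()}


lg335 = {"C": 25, "H": 30, "O": 3}   # molecular formula given for LG335
percent = mass_percent(lg335)
print(f"C: {percent['C']:.2f}  H: {percent['H']:.2f}")
```

Under these assumptions the computed values agree with the calculated figures in the text (C 79.33, H 7.99), and the found values (C 79.10, H 7.96) differ from them by less than 0.3 percentage points.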

Expression Plasmids. pGAD10BAACTR, pGBT9Gal4, pGBDRXRa, pCMX-hRXR,
and pCMX-βGAL have been described. pCMX-hRXR mutants were
cloned from pGBDRXR vectors using SalI and PstI restriction enzymes and
ligated into similarly cut pCMX-hRXR vectors. pLuc_CRBPII_MCS was
constructed as below. All plasmids have been confirmed through sequencing.

pGBDRXRa was cut with SmaI and NcoI, filled in, and blunt-end ligated to
eliminate 153 amino acids of the RXR DBD. A HindIII site in the tryptophan
selectable marker was silently deleted and the sole remaining HindIII site was
cut, filled in, and blunt-end ligated to remove the restriction site. Unique HindIII
and SacI sites were inserted into the RXR LBD gene and MfeI and EcoRI sites
were removed from the plasmid using QuikChange Site-Directed Mutagenesis
(Stratagene, La Jolla, CA) to create pGBDRXRaL-SH-ME.

pLuc_CRBPII_MCS was made by site-directed mutagenesis from
pLucMCS (Stratagene, USA). Site-directed primers were designed to
incorporate a CRBPII response element in the multiple cloning site (MCS),
controlling transcription of the firefly luciferase gene.

Plasmids expressing the fusion protein of the Gal4 activation domain
with the coactivators are based on the commercial plasmid pGAD10
(Clontech, USA). The pGAD10 vector contains the Gal4 activation domain
(residues 491-829) fused to a multiple cloning site (MCS) and uses a leucine
marker. Additional restriction enzyme sites were added to the MCS of the
plasmid via site-directed mutagenesis. Primers were designed to add the
following restriction sites: NdeI, EagI, EclXI, NotI, XmaIII, XmaI, and
SmaI, forming a new plasmid known as pGAD10BA (Figure 17). This plasmid
was sequenced and used for specific interaction studies mentioned in the
results.

pCMX-ACTR, the expression plasmid for the human nuclear receptor
coactivator ACTR, was a kind gift from Dr. Ron Evans (Salk Institute for
Biological Studies, La Jolla, CA). pCR3.1hSRC-1, the expression plasmid for
the human nuclear receptor coactivator SRC-1, was a kind gift from Dr. Bert
O'Malley (Baylor College of Medicine, Houston, TX). Both ACTR (residues
1-1413) and SRC-1 (residues 54-1442) genes were amplified via PCR with
primers that contained BglII and NotI sites. The PCR products were digested
with the two restriction enzymes and cleaned using the Zymo "DNA Clean
and Concentrator Kit" (Zymo Research, Orange, CA) spin columns.
pGAD10BA was digested with BglII and NotI and ligated with both the ACTR
and SRC-1 products. Ligations were transformed into Z-competent (Zymo
Research, Orange, CA) XL1-Blue cells (Stratagene, La Jolla, CA).
Transformants were rescued and sequenced. The final plasmids are called
pGAD10BAACTR and pGAD10BASRC1.

Plasmid Construction. The zero background plasmid, pGBDRXR:3Stop, was
constructed using QuikChange Site-Directed Mutagenesis with
pGBDRXRaL-SH-ME as the template and the 3Stop insert cassette (described
below) as primers.

The 3Stop insert cassette was synthesized using PCR from eight
oligonucleotides (Fig. 16). All PCRs were done using 2.5 U Pfu Polymerase
(Stratagene, La Jolla, CA), 1 x Pfu buffer, 0.8 mM dNTPs, 50 ng of
pGBDRXRaL-SH-ME as a template, 125 ng of primers and sterile water to make
50 µL. First, four small cassettes were synthesized in reactions containing the
following primers: Cassette 1, F (5'-CGGAATTTCC CATGGGC-3') (SEQ ID NO.
1), BPf (5'-CTCGCCGAAC GACCCGGTCA CCGCATGCCA CTAGTGG-3')
(SEQ ID NO. 2), and BPr (5'-CCGCTTGGCC CACTCCACTA GTGGCATGCG
GTGACC-3') (SEQ ID NO. 3); Cassette 2, BPf, BPr, SEf (5'-CGGGCAGGCT
GGAATGAGCT CCTCGACGGA ATTCTCC-3') (SEQ ID NO. 4), and SEr
(5'-CAGCCCGGTG GCCAGGAGAA TTCCGTCGAG GAGCTC-3') (SEQ ID NO.
5); Cassette 3, SEf, SEr, AMf (5'-CTCTGCGCTC CATCGGGCTT
AAGTGCCCAC CAATTGACAC-3') (SEQ ID NO. 6), and AMr
(5'-CTCCAGCATC TCCATAAGGA AGGTGTCAAT TGGTGGGCAC
TTAAGC-3') (SEQ ID NO. 7); Cassette 4, AMf, AMr, and R (5'-CAAAGGATGG
GCCGCAG-3') (SEQ ID NO. 8). The cassettes were cleaned with either the
DNA Clean and Concentrator-5 (Zymo Research, Orange, CA) or the Zymoclean
Gel DNA Recovery Kit (Zymo Research, Orange, CA) depending on product
purity. The four cassettes were used to make the final 3Stop insert cassette in a
PCR that contained each cassette, primers F and R, dNTPs, Pfu Polymerase,
and sterile water to a final volume of 50 µL. The 3Stop cassette was cleaned
using the Zymoclean Gel DNA Recovery Kit.

WO 2005/049804 PCT/US2004/038506

Insert Cassette Library Construction. The library of insert cassettes with
randomized codons was constructed in a similar manner as above. The four
cassettes (FBP, BPSE, SEAM, and AMR) were made in the following ways
(Supporting Information Fig. 7b).

For the FBP cassette, oligos BP1 (5'-GGCAAACATG GGGCTGAACC
CCAGCTCGCC GAACGACCCG GTCACC-3') (SEQ ID NO. 9), BP2
(5'-GCCCACTCCA CTAGTGTGAA AAGCTGTTTG TC(A, C, or T)(A or G)(C or
G)(A, C, or T)(A or G)(C or G)TT GGCA(A, C, or T)(A or G)(C or G)GTT
GGTGACCGGG TCGTTCG-3') (SEQ ID NO. 10), BP3 (5'-CTTTTCACAC
TAGTGGAGTG GGCCAAGCGG ATCCCACACT TCTCAGAG-3') (SEQ ID NO.
11), and BP4 (5'-GGGGCAGCTC TGAGAAGTGT GGGATCCG-3') (SEQ ID NO.
12) were mixed with TE containing 100 mM NaCl to bring the total volume to 50
µL. The mixture was heated to 95 °C for 1 minute, then slowly cooled to 10 °C.
The annealed mixture was combined with EcoPol Buffer, dNTPs, ATP, Klenow
(NEB, Beverly, MA), T4 DNA ligase (NEB, Beverly, MA) and sterile water to 200
µL, and kept at 25 °C for 45 min before heat inactivation at 75 °C for 20 minutes.
The product was cleaned with DNA Clean and Concentrator-5 to make the BP
cassette. Next, the BP cassette was combined with Pfu Buffer, pGBDRXR:3Stop,
oligo F, dNTPs, Pfu polymerase, and sterile water to make 50 µL for a PCR.
The final FBP product (300 bp) was purified using the Zymoclean Gel DNA
Recovery Kit.

BPSE was made in two consecutive PCRs. First, SE1
(5'-GCAGGCTGGA ATGAGCTCCT C(A, G, or T)(C or T)(G or C)GCCTCC (A,
G, or T)(C or T)(G or C)TCCCACC GCTCCATC-3') (SEQ ID NO. 13) and SE2
(5'-CCGGTGGCCA GGAGAATTCC GTCCTTCACG GCGATGGAGC
GGTGGG-3') (SEQ ID NO. 14) were combined with Pfu buffer, dNTPs, Pfu
polymerase, and sterile water to make 50 µL. After 5 PCR cycles,
pGBDRXR:3Stop and BP were added to the reaction and the PCR was
continued for 30 cycles. The product (240 bp) was purified using the Zymoclean
Gel DNA Recovery Kit.

SEAM was constructed in a similar way to BPSE. SE1 and SE2 were
mixed with Pfu Buffer, dNTPs, Pfu polymerase, and sterile water to 25 µL.
Simultaneously, AM1 (5'-GGCTCTGCGC TCCATCGGGC TTAAGTGCCT
GGAACAT(A, G, or T)(C or T)(G or C) TTSCTTCTTC AAGCTCATCG
GGG-3') (SEQ ID NO. 15) and AM2 (5'-GCATCTCAAT AAGGAAGGTG
TCAATTGTGT GTCCCCGATG AGCTTGAAGA A-3') (SEQ ID NO. 16) were
combined with Pfu Buffer, dNTPs, Pfu polymerase, and sterile water to 25 µL.
After 5 cycles, these two reactions were mixed and pGBDRXR:3Stop was
added. The PCR was continued for 30 cycles. The PCR product (460 bp) was
purified using the Zymoclean Gel DNA Recovery Kit.

The AMR cassette was made similarly to FBP. AM1 and AM2 were
mixed with TE containing 100 mM NaCl to make 50 µL, heated to 95 °C for 1
minute, then slowly cooled to 10 °C. The annealed mixture was combined with
EcoPol Buffer, dNTPs, Klenow, and sterile water to 200 µL, and kept at 25 °C for
45 min before heat inactivation at 75 °C for 20 minutes. The product (AM) was
precipitated with isopropanol. Next, AM and R were combined with Pfu buffer,
pGBDRXR:3Stop, dNTPs, Pfu Polymerase, and sterile water to make 50 µL for a
PCR. The product (140 bp) was purified using the Zymoclean Gel DNA
Recovery Kit.

The four cassettes (FBP, BPSE, SEAM, and AMR) were combined in a
PCR to make the library of randomized insert cassettes (6mutIC). The library
was cleaned using Bio-Spin 30 columns (Bio-Rad Laboratories, Hercules, CA).

Yeast selection plates and transformation. Synthetic complete (SC) media
and plates were made as previously described (7). Selective plates were made
without tryptophan (-Trp) and leucine (-Leu) or without adenine (-Ade),
tryptophan (-Trp) and leucine (-Leu). Ligands were added to the media after
cooling to 50 °C.

The randomized cassette library was homologously recombined into the
pGBDRXR:3Stop plasmid using the following method. pGBDRXR:3Stop was
first digested with BssHII and EagI (NEB, Beverly, MA), and then treated with
calf intestinal phosphatase (NEB, Beverly, MA), to make a vector cassette.
Vector cassette (1 µg) and 6mutIC (9 µg) were transformed according to Gietz's
transformation protocol (8) on a 10X scale into the PJ69-4A yeast strain, which
had previously been transformed with a plasmid (pGAD10BAACTR) (manuscript
submitted) expressing the nuclear receptor coactivator ACTR fused to the yeast
Gal4 activation domain. Homologous regions between the vector cassette and
the insert cassette allow the yeast to homologously recombine the insert
cassette with the vector cassette, forming a circular plasmid with a complete RXR
LBD gene. The transformation mixture (1 mL) was spread on each of 10 large
plates of SC -Ade -Trp -Leu media containing 10 µM LG335. The transformation
mixture (2 and 20 µL) was also spread on SC -Trp -Leu media. These plates
were grown for 4 days at 30 °C.



Molecular Modeling. Docking of LG335 into modified binding pockets was
done using the InsightII module Affinity. The wild type RXR with 9cRA crystal
structure (9) was modified using the Biopolymer module residue replace tool to
make mutations in the binding pocket that corresponded to the mutations in
variants I268A;I310A;F313A;L436F, I268V;A272V;I310L;F313M, and
I268A;I310S;F313A;L436F. The ligand was placed in the binding pocket by
superimposing the carboxylate carbon and two carbons in the
tetrahydronaphthalene ring of LG335 onto corresponding carbons of 9cRA in the
crystal structure. A Monte Carlo simulation was performed first, followed by
Simulated Annealing of the best docked conformations.



Library Evaluation 

To evaluate the efficiency of library creation and selection we take a binary
approach: either the sequence is or is not a designed sequence. Eq. 1 is the
relevant binomial distribution for statistical evaluation of the libraries.

$$P = \frac{(N-1)!}{(k-1)!\,(N-k)!}\; p^{k}\,(1-p)^{N-k} \qquad (1)$$

In Eq. 1, N is the number of sequenced plasmids; k is the number of background
or designed plasmids; p is the frequency of the occurrence of either background
or designed plasmid; and P is the measure of certainty. Applying Eq. 1 to the
libraries, we conclude with 95% certainty that the unselected library is at least
72% background and the selected library is at least 78% designed sequences.
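The two percentages quoted above can be reproduced for the special case in which every sequenced plasmid falls into one class (k = N), where the expression in Eq. 1 reduces to p^N. The sketch below assumes that the 95% certainty corresponds to setting p^N equal to 0.05 and that the counts are the 9 of 9 background and 12 of 12 designed sequences reported in Example 3; the function name is hypothetical.

```python
def lower_bound(n_sequenced, certainty=0.95):
    """Smallest frequency p consistent with observing n of n at the given certainty.

    Solves p**n = 1 - certainty, i.e. the k = N case of the binomial term.
    """
    return (1.0 - certainty) ** (1.0 / n_sequenced)


unselected = lower_bound(9)    # 9 of 9 sequenced plasmids were background
selected = lower_bound(12)     # 12 of 12 sequenced plasmids were designed
print(f"unselected library: at least {unselected:.0%} background")
print(f"selected library:   at least {selected:.0%} designed")
```

Under these assumptions the bounds come out at about 72% and 78%, in agreement with the values stated in the text.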

Genotype Determination. Plasmids were rescued using either the Powers
method (www.fhcrc.org/labs/gottschling/yeast/yplas.html) or the Zymoprep Kit
(Zymo Research, Orange, CA). The plasmids were then transformed into
Z-competent (Zymo Research, Orange, CA) XL1-Blue cells (Stratagene, La Jolla,
CA). The QIAprep Spin Miniprep Kit (Qiagen, Valencia, CA) was used to purify
the DNA from the transformants. These plasmids were sequenced.

Quantitation Assays 
Solid Media. The rescued plasmids were transformed into PJ69-4A
containing the pGAD10BAACTR plasmid and plated on SC -Trp -Leu media.
These plates were grown for 2 days at 30 °C.

Colonies were streaked onto the following media: SC, SC -Trp -Leu, SC
-Ade -Trp -Leu, and SC -Ade -Trp -Leu plus increasing concentrations of LG335 or
9cRA from 1 nM to 10 µM.

Liquid Media. The method used for quantitation was modified from a method
developed by Miller and known in the art.

Mammalian Luciferase Assay. Performed with HEK 293 cells as previously
described, and known in the art.

Streaking cells onto adenine selective plates using PJ69-4A

Yeast transformants containing the plasmids were streaked onto the
selective plates (SC -Ade) with different ligand concentrations using sterile
toothpicks. Plates were divided into sectors for the samples and controls; the
control sectors contain pGBDMT and pGBT9Gal4. The same colony was
used for streaking on all the plates, ending with an SC plate to confirm efficient
transfer of the cells to each plate. Both selective and non-selective plates
were incubated at 30 °C for two days. Each set of genetic selection plates
was replicated at least once.

Streaking cells onto FOA plates using MaV103

Yeast transformants containing the plasmids were streaked onto
selective plates, SC -Leu -Trp, containing 5-fluoroorotic acid (FOA) and
different ligand concentrations. Plates were also divided into sectors, with
pGBT9Gal4 and pGBDMT as controls. The same procedure was used for
streaking as for the adenine selection plates. Plates were incubated for two
days. Each set of the genetic selection plates was replicated at least once.

EXAMPLES 

Example 1: Library Design

The binding pocket of the RXR LBD is composed of primarily hydrophobic
side chains plus several positively charged residues that stabilize the negatively
charged carboxylate group of 9cRA. The target ligand, LG335, contains an
analogous carboxylate group, so the positively charged residues were left
unchanged. We hypothesized that binding affinity arises from hydrophobic
contacts and that specificity arises from binding pocket size, shape, hydrogen
bonding, and electrostatics. The randomized amino acids were chosen based on
their proximity to the bound 9cRA as observed in the crystal structure and the
results of site-directed mutagenesis (supporting information Fig. 14). The
electrostatic interactions were held constant while the size, shape, and potential
hydrogen bonding interactions were varied to find optimum contacts for LG335
binding. A library of RXRs with mutations at six positions was created. At three of
the positions (I268, A271, and A272) there are four possible amino acids (L, V, A,
and P) and at the other three positions (I310, F313, and L436) there are eight
possible amino acids (L, I, V, F, M, S, A, and T). The combination of six positions
and number of encoded amino acids allowed testing of the library construction
while keeping the library size (32,768 amino acid combinations and ~3 million
codon combinations) within reasonable limits. Proline was included in the library
as a negative control. Residues 268, 271, and 272 are in the middle of helix 3,
which would be disrupted by the inclusion of proline. Therefore, proline residues
should appear at these positions only in unselected variants and not in the
variants that activate in response to ligand. The substitutions at positions 268,
271, and 272 were restricted to small amino acids, allowing access to the
positively charged residues at this end of the pocket.
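The stated library size can be reproduced by expanding the degenerate codons given for the library oligonucleotides. The sketch below is illustrative and rests on two assumptions: the standard genetic code applies, and oligo BP2 is written as the antisense strand, so its degenerate triplets (A, C, or T)(A or G)(C or G) are read here as their coding-strand reverse complement (C or G)(C or T)(A, G, or T). The helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order (assumed to apply here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand(first, second, third):
    """Expand a degenerate codon into its codons and encoded amino acids."""
    codons = ["".join(c) for c in product(first, second, third)]
    return codons, {CODON_TABLE[c] for c in codons}


# Positions 268, 271, 272: coding-strand reading of the BP2 triplets (assumption).
small_codons, small_aas = expand("CG", "CT", "AGT")
# Positions 310, 313, 436: degenerate codon as written in SE1 and AM1.
large_codons, large_aas = expand("AGT", "CT", "GC")

amino_acid_combinations = len(small_aas) ** 3 * len(large_aas) ** 3
codon_combinations = len(small_codons) ** 3 * len(large_codons) ** 3

print(sorted(small_aas), sorted(large_aas))
print(amino_acid_combinations, codon_combinations)
```

Under these assumptions each degenerate codon expands to 12 codons, encoding four amino acids (L, V, A, P) at the first three positions and eight (L, I, V, F, M, S, A, T) at the other three, which gives 32,768 amino acid combinations and 2,985,984 (about 3 million) codon combinations.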

To eliminate contamination of the library with unmutated, wild-type RXR,
the gene was modified to create a non-functional gene, RXR:3Stop. Forty base
pairs were deleted at three separate sites, producing three stop codons in the
coding region to create this nonfunctional gene. The deletions correspond to
regions in the RXR gene where randomized codons are designed. This plasmid,
pGBDRXR:3Stop, was cotransformed into yeast with the library of insert
cassettes containing full-length RXR LBD genes with randomized codons at
positions 268, 271, 272, 310, 313, and 436. The insert cassettes and the plasmid
contain homologous regions enabling the yeast to homologously recombine the
cassette into the plasmid. Recombination repairs the deletions in the RXR:3Stop
gene to make full-length genes with mutations at the six specific sites.

Example 2: Library selection.

To limit the number of variants to be screened, the library was subjected
to chemical complementation (Fig. 1). Chemical complementation exploits the
power of genetic selection to make the survival of yeast dependent on the
presence of a small molecule. The PJ69-4A strain of S. cerevisiae has been
engineered for use in yeast two-hybrid genetic selection and screening assays.
For selection, PJ69-4A contains the ADE2 gene under the control of a Gal4
response element. Plasmids created through homologous recombination in
PJ69-4A express the Gal4 DBD fused with a variant RXR LBD (GBD:RXR). A
plasmid expressing ACTR, a nuclear receptor coactivator, fused with the Gal4
activation domain (ACTR:GAD), was also transformed into PJ69-4A. If a ligand
causes a variant RXR LBD to associate with ACTR, transcription of the ADE2
gene is activated. Expression of ADE2 permits adenine biosynthesis and,
therefore, yeast survival on media lacking adenine.

A small amount of the yeast library was plated onto media (SC -Leu -Trp)
selecting only for the presence of the plasmids pGAD10BAACTR (expressing
ACTR:GAD and containing a leucine selective marker) and mutant pGBDRXR
(expressing variant GBD:RXR and containing a tryptophan selective marker).
The majority of the yeast cells transformed with the RXR library were plated
directly onto SC -Leu -Trp -Ade media containing 10 µM LG335, selecting for
adenine production in response to the compound LG335. The transformation
efficiency of this library into yeast strain PJ69-4A was 3.8 x 10^4 colonies per µg
DNA. This number includes both the efficiency of transforming the DNA into the
cells and the homologous recombination efficiency. Of the approximately
380,000 transformants, approximately 300 grew on SC -Ade -Trp -Leu + 10 µM
LG335 selective media.
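The numbers in this paragraph can be related with simple arithmetic. The sketch below is a back-of-the-envelope illustration, not part of the described method; the codon-level library size of 12^6 is an assumption (12 codons at each of six randomized positions), and the variable names are hypothetical.

```python
efficiency_per_ug = 3.8e4      # colonies per microgram DNA (from the text)
transformants = 380_000        # approximate total transformants (from the text)
selected = 300                 # approximate colonies on selective media (from the text)
codon_combinations = 12 ** 6   # assumed codon-level library size (about 3 million)

# Amount of DNA implied by the efficiency and the transformant count.
implied_dna_ug = transformants / efficiency_per_ug
# Fraction of transformants that survived selection.
selected_fraction = selected / transformants
# Upper bound on the share of codon combinations that could have been sampled.
sampled_fraction = transformants / codon_combinations

print(f"implied DNA: {implied_dna_ug:.0f} ug")
print(f"selected: {selected_fraction:.3%} of transformants")
print(f"sampled: at most {sampled_fraction:.1%} of codon combinations")
```

Because many transformants carry background plasmids rather than designed sequences, the sampled share of the designed library is likely lower than this upper bound.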
Example 3: Library Characterization.

Twenty-one plasmids were rescued from yeast colonies: nine from non-selective
plates (SC -Trp -Leu) and twelve from selective plates (SC -Ade -Trp -Leu + 10
µM LG335). The relevant portion of plasmid DNA from these colonies was
sequenced to determine the genotype (Table 1). All nine of the plasmid
sequences from the non-selective plates contained at least one deletion and are
non-functional genes. Of the twelve plasmids that grew on the selective media,
all contain full-length RXR LBDs with designed mutations. With 95% certainty,
we conclude that the unselected library is at least 72% background and the
selected library is at least 78% designed sequences (supporting information).

Table 1. Genotypes of mutants from unselected and selected libraries

Mutant   I268     A271     A272     I310     F313     L436

Unselected library
1        Deleted  Deleted  Deleted  Deleted  Deleted  Deleted
2        Deleted  Deleted  Deleted  Deleted  Deleted  Deleted
3        GTA(V)   CCT(P)   CCT(P)   TCG(S)   TCG(S)   Deleted
4        Deleted  Deleted  Deleted  Deleted  Deleted  Deleted
5        Deleted  Deleted  Deleted  Deleted  Deleted  GCG(A)
6        Deleted  Deleted  Deleted  Deleted  Deleted  Deleted
7        Deleted  Deleted  Deleted  Deleted  Deleted  Deleted
8        Deleted  Deleted  Deleted  Deleted  Deleted  Deleted
9        Deleted  Deleted  Deleted  Deleted  Deleted  TTC(F)

Selected library
1        GTG(V)   WtRXR    GCA      TTG(L)   ATG(M)   TTG
2        GTG(V)   WtRXR    GCA      GTG(V)   TCC(S)   TTG
3        CTA(L)   GCT      GCA      ATG(M)   GTG(V)   TTG
4        GCG(A)   WtRXR    GCA      TCC(S)   GTG(V)   TTC(F)
5        GCT(A)   GCT      GCA      GCC(A)   GCG(A)   TTC(F)
6        GCT(A)   GCT      GTT(V)   GCC(A)   GCG(A)   TTC(F)
7        CTT(L)   GCT      GCT      GTC(V)   ATC(I)   TTG
8        CTG(L)   GTG(V)   GCG      TTG(L)   TTG(L)   TTG
9        GTG(V)   GTG(V)   GCG      TTG(L)   GTG(V)   TTG
10       GTA(V)   WtRXR    GTG(V)   ATG(M)   TCC(S)   ATG(M)
11       GCG(A)   GCG      GCA      ATG(M)   GCG(A)   ACG(T)
12       GCG(A)   GCT      GCG      TCG(S)   GTC(A)   TTC(F)

Sequenced codons are followed by the encoded amino acid in parentheses.
"WtRXR" indicates that the sequence corresponds to the wild-type RXR codon.
"Deleted" indicates the presence of an unmutated 3Stop deletion background
cassette.

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Example 4: Variant Characterization in Yeast. 

The twelve plasmids rescued from the selective plates were
retransformed into PJ69-4A to confirm that their phenotype is plasmid linked.
The strain PJ69-4A was engineered to contain a Gal4 response element
controlling expression of the LacZ gene, in addition to the ADE2 gene. Both
selection and screening were used to determine the activation level of each
variant by 9cRA and LG335. The selection assay quantifies yeast growth
occurring through transcriptional activation of the ADE2 gene, while the screen
quantifies β-galactosidase activity occurring through transcriptional activation
of the LacZ gene. Although the selection assay (Fig. 2) is ~10-fold more
sensitive than the screen (Fig. 3), it does not quantify activation level
(efficacy) as well as the screen. In the selection assay, there is either growth
or no growth, whereas the screen more accurately quantifies different activation
levels at various concentrations of ligand (Figs. 2 and 3). The differences will
be more fully discussed in a future publication.

Three plasmids were used as controls in the screen and selection assays.
The plasmids pGBDRXRα and pGBT9Gal4 were used as positive controls to
which the activation level of the variants can be compared. pGBDRXRα
expresses the gene for the "wild-type" GBD:RXR, which supports growth and is
activated by 9cRA but not by LG335. pGBT9Gal4 expresses the gene for the
ligand-independent yeast transcription factor Gal4 (25), which is constitutively
active in the presence or absence of either ligand. The plasmid pGBDRXR:3Stop
serves as a negative control. pGBDRXR:3Stop carries a non-functional RXR LBD
gene; therefore, yeast transformed with this plasmid neither grow in the
selection assay nor show activity in the screen. This plasmid provides a measure
of background noise in both the selection and screen assays.

Both the selection and screen assays show that ten of the twelve variants
are selectively activated by LG335. Results of these assays are shown in Figures
2 and 3. Table 2 summarizes the transcriptional activation profiles of all twelve
variants in response to both 9cRA and LG335 compared to wild-type RXR.

Table 2. EC50 and efficacy in yeast and HEK 293 cells for RXR variants

                               9cRA                          LG335
                               Yeast          HEK 293        Yeast          HEK 293
Variant                        EC50     Eff   EC50     Eff   EC50     Eff   EC50     Eff
WT                                      100   220      100   >10,000  10    300      10
I268A;I310A;F313A;L436F                 0     >10,000  0     220      70    30       50
I268V;A272V;I310L;F313M        >10,000  10    1,600    30    40       60    1        30
I268A;I310S;F313V;L436F        >10,000  10                   470      60
I268A;I310S;F313V;L436F        >10,000  0     >10,000  0     430      50    690      20
I268V;A272V;I310M;F313S;L436M  >10,000  10    >10,000  0     680      30    180      30
I268A;A272V;I310A;F313A;L436F  >10,000  0                    530      30
I268L;A271V;I310L;F313L        >10,000  0                    530      20
I268A;I310M;F313A;L436T        >10,000  0     >10,000  0     610      10    140      20
I268V;A271V;I310L;F313V        >10,000  0                    650      10
I268L;I310V;F313I              >10,000  0                    >2000    10
I268L;I310M;F313V              >10,000  20                   610      20
I268V;I310V;F313S              >10,000  0                    440      10

EC50 values (given in nM) represent the averages of two screen experiments in
quadruplicate for yeast and in triplicate for HEK 293. Efficacy (Eff; given as a
percent) is the maximum increase in activation relative to the increase in
activation of wild type with 10 µM 9cRA. Values represent the averages of two
screen experiments in quadruplicate for yeast and in triplicate in HEK 293.
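
The efficacy definition in the Table 2 footnote can be written as a short calculation. This is a minimal sketch: the function name, argument names, and the reporter readings in the usage lines are hypothetical and are not data from the disclosure.

```python
def efficacy_percent(variant_basal: float,
                     variant_max: float,
                     wt_basal: float,
                     wt_at_10uM_9cRA: float) -> float:
    """Efficacy as defined under Table 2: the variant's maximum increase
    in activation, as a percentage of the increase seen for wild-type RXR
    with 10 uM 9cRA. All inputs are reporter readings in the same units.
    """
    wt_increase = wt_at_10uM_9cRA - wt_basal
    if wt_increase <= 0:
        raise ValueError("wild-type reference must show ligand activation")
    return 100.0 * (variant_max - variant_basal) / wt_increase


# Hypothetical readings (arbitrary units), for illustration only.
print(efficacy_percent(variant_basal=2.0, variant_max=8.0,
                       wt_basal=1.0, wt_at_10uM_9cRA=11.0))
```

By construction, wild-type RXR with 9cRA scores 100 on this scale, which matches the WT row of Table 2.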



Five variants were chosen for testing in mammalian cell culture for
comparison of the activation profiles (I268A;I310A;F313A;L436F,
I268V;A272V;I310L;F313M, I268A;I310S;F313A;L436F,
I268V;A272V;I310M;F313S;L436M, and I268A;I310M;F313A;L436T). The genes
for these variants were removed from yeast expression plasmids and ligated into
mammalian expression plasmids.

Although I268L;I310M;F313V is constitutively active in the selection assay
(Fig. 2n) and has high basal activity in the screen assay, both 9cRA and LG335
increase activity at micromolar concentrations (Fig. 3n). This variant may be in
an intermediate conformation, with weakly activated transcription that can be
improved by ligand binding. The high basal activation could also be due to a
change in the conformation equilibrium with a shift towards the active
conformation when ligand is not present.

I268V;I310V;F313S is constitutively active on solid media (data not
shown), but shows no activation in the screen (0% Eff., Table 2, Fig. 3o) and
only grows in the liquid media selection after two days (Fig. 2o). The basal
activation level may be below the threshold of detection for the liquid media
assays. However, it is also possible that agar, which is not present in the
liquid assays, contains some small molecule that activates the receptor.

Activation levels and EC50s correlate in yeast and HEK 293 cells (Fig. 4
and Table 2). For the majority of the variants, 9cRA shows little or no
activation in yeast or mammalian cells. Variant I268V;A272V;I310L;F313M is
activated slightly by 9cRA in yeast, but in mammalian cells is activated to the
same level by both 9cRA and LG335 (Figs. 2, 3 and 4). With one exception, all
variants tested have EC50s within 10-fold in yeast and mammalian cells.
However, the EC50s in mammalian cells are generally lower than in yeast. We
speculate that this shift is due to increased penetration of LG335 into
mammalian cells versus yeast.
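
The "within 10-fold, with one exception" statement can be checked against the LG335 EC50 values for the five variants tested in both systems. The sketch below uses the values as read from Table 2 (in nM); the variable names and the choice of a symmetric fold ratio are assumptions made for illustration.

```python
# LG335 EC50 values in nM as read from Table 2: (yeast, HEK 293).
lg335_ec50_nM = {
    "I268A;I310A;F313A;L436F": (220.0, 30.0),
    "I268V;A272V;I310L;F313M": (40.0, 1.0),
    "I268A;I310S;F313V;L436F": (430.0, 690.0),
    "I268V;A272V;I310M;F313S;L436M": (680.0, 180.0),
    "I268A;I310M;F313A;L436T": (610.0, 140.0),
}


def fold_difference(a: float, b: float) -> float:
    """Symmetric fold difference between two positive values (always >= 1)."""
    if a <= 0 or b <= 0:
        raise ValueError("EC50 values must be positive")
    return max(a, b) / min(a, b)


# Variants whose yeast and HEK 293 EC50s differ by more than 10-fold.
exceptions = [name for name, (yeast, hek) in lg335_ec50_nM.items()
              if fold_difference(yeast, hek) > 10.0]
# Variants with a lower EC50 in HEK 293 cells than in yeast.
lower_in_hek = [name for name, (yeast, hek) in lg335_ec50_nM.items()
                if hek < yeast]

for name, (yeast, hek) in lg335_ec50_nM.items():
    print(f"{name}: {fold_difference(yeast, hek):.1f}-fold")
print("more than 10-fold apart:", exceptions)
print("lower EC50 in HEK 293:", len(lower_in_hek), "of", len(lg335_ec50_nM))
```

With these inputs a single variant falls outside the 10-fold window, and most variants show the lower EC50 in HEK 293 cells, in line with the text.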

Subtle differences in binding pocket shape can have a drastic effect on
specificity. For example, the I268V;A272V;I310L;F313M variant is activated to
high levels by LG335 (60% Eff., Table 2), and is only slightly activated by
10 µM 9cRA in yeast (Fig. 3e), yet the amino acid changes are extremely
conservative. The volume difference between phenylalanine and methionine side
chains is only ~4 Å³ and their polarity difference is minimal (hydration
potentials of the methionine and phenylalanine side chains are -0.76 kcal mol⁻¹
and -1.48 kcal mol⁻¹, respectively). The other mutations redistribute methyl
groups within the binding pocket, with a net difference of one methyl group
(~18 Å³).

The LG335-I268V;A272V;I310L;F313M ligand-receptor pair also
represents a 25-fold improvement in EC50 over the previous best LG335
receptor, Q275C;I310M;F313I (40 nM vs. 1 µM in yeast). The
Q275C;I310M;F313I variant was created using site-directed mutagenesis. Subtle
changes in the I268V;A272V;I310L;F313M variant produced a better
ligand-receptor pair than the Q275C;I310M;F313I variant. This conclusion is
consistent with the observation that nuclear receptors bind ligands through an
induced-fit mechanism. With current knowledge about protein-ligand interactions
it is not possible to rationally design ligand-receptor pairs with specific
activation profiles. Libraries and chemical complementation are a new way to
circumvent this problem and obtain functional variants with a variety of
activation profiles.

Molecular modeling was used to generate hypotheses about the structural
basis of ligand specificity for the variants discovered in the library. First,
mutations to smaller or more flexible side chains at positions 310 and 313 are
essential to provide space for the propyl group of LG335. All variants activated
by LG335 have mutations at these two positions. Second, mutations to amino
acids with larger side chains at position 436 sterically clash with the methyl
group at the 9 position of 9cRA. This interaction may prevent helix 12 from
closing properly and therefore prevent activation by 9cRA. The only variant
significantly activated by 9cRA (I268V;A272V;I310L;F313M) does not contain a
mutation at position 436. Third, we hypothesize that tight packing in the
binding pocket may lead to lower EC50s. The docking results for
I268V;A272V;I310L;F313M with LG335 show that the methionine and leucine side
chains pack tightly against the propyl group of LG335, which may result in
tighter binding and consequently a lower EC50.

In the absence of functional data, chemical complementation may be
used to test more hypotheses about the function of particular residues than
would be possible through site-directed mutagenesis. By making a library of
changes at a single site, additional information could be obtained about the
importance of side chain size, polarity, and charge over just the traditional
mutation to alanine that is often used to explore single residue importance. In
the absence of structural information, it is possible to make large libraries
using error-prone PCR or gene shuffling. Chemical complementation could also be
used to select active variants from these types of libraries.

Example 5: Increasing the Sensitivity of Chemical Complementation with ACTR.

To increase the sensitivity of chemical complementation, an adapter
protein was introduced to link the mammalian nuclear receptor function to the
yeast transcription apparatus, thereby overcoming the evolutionary
divergence between mammalian cells and yeast. The human nuclear
receptor coactivator ACTR was fused to the yeast Gal4 activation domain.
The resulting plasmid, pGAD10BAACTR, expresses the ACTR:GAD fusion protein
and contains a leucine marker. This plasmid was co-transformed into yeast
with the plasmid pGBDRXR, which expresses the Gal4 DNA binding domain
(DBD) fused to the RXR ligand binding domain (GBD:RXR) and contains a
tryptophan marker. Transformants were selected on SC -Leu-Trp plates, and
were streaked onto adenine selective plates (SC -Ade) containing
10'^ M 9cRA, a known ligand for RXR (Figure 5G). Yeast containing just the
pGBDRXR plasmid, the pGAD10BAACTR plasmid, a plasmid with just the
Gal4 DBD (pGBDMT), and a plasmid containing the Gal4 holo protein
(pGBT9Gal4) were also streaked onto these plates as controls.

After two days of incubation, growth occurs on the sector of the plate
containing ACTR:GAD with GBD:RXR and on the sector of the plate with
Gal4, whereas no growth occurs on the sector of the plate with GBD:RXR
alone (Figure 5G). The growth density produced by GBD:RXR and
ACTR:GAD is the same as the growth produced by the holo Gal4.
Importantly, GBD:RXR and ACTR:GAD produced no growth on plates without
9cRA.

Previous findings showed that no growth was observed with RXR at
9cRA concentrations lower than 10*^ M. To determine whether the sensitivity of
our system had increased with the introduction of the adapter fusion
protein, a dose response was performed on adenine selective plates (SC
-Ade) containing ligand concentrations ranging from 10'^M to 10'^M. After
two days of incubation, a clear dose response occurs on the plates
(Figure 5). Without ligand, growth occurs only on the Gal4 sector of the
plate, as expected. At concentrations as low as 10*^ M 9cRA,
ligand-activated growth occurs only on the sector of the plate containing both
GBD:RXR and ACTR:GAD (Figure 5D). At concentrations of ligand
above 10"^ M, higher density growth is observed on the sector of the plate
containing GBD:RXR with ACTR:GAD. No growth occurs with GBD:RXR
alone, as expected. In summary, the introduction of the fusion protein
ACTR:GAD increases the sensitivity of chemical complementation.

Growth occurs on adenine selective plates with 9cRA after two days of
incubation (Figure 5). Ligand-activated growth is observed at 9cRA
concentrations as low as 10"® M. With chemical complementation, the
approximate EC50 value for wild-type RXR and 9cRA is between 10"® M and
10'^ M, which is comparable to the EC50 value measured for wild-type
RXR in mammalian cell assays (~10 ' M) (Figure 5). The growth density
and rate with the ACTR:GAD fusion protein are comparable to Gal4
activated growth. The same results were obtained on adenine selective
plates (SC -Ade-Trp and SC -Ade-Leu-Trp) and on histidine selective
plates (data not shown). In summary, introducing an adapter fusion
protein of the human coactivator with the Gal4 activation domain
increases the sensitivity of chemical complementation 1000-fold, making
this system more efficient for analysis of protein/ligand interactions.

Example 6: Increasing Sensitivity of Chemical Complementation using SRC-1

Another RXR coactivator was tested to increase the sensitivity of
chemical complementation. Residues 54 to 1442 of the human nuclear
receptor coactivator SRC-1 were fused to the Gal4 activation domain to
construct the plasmid pGAD10BASRC1. This plasmid, which expresses
SRC1:GAD in yeast and contains a leucine marker, was transformed with
GBD:RXR; transformants selected from SC -Leu-Trp were streaked onto
adenine selective plates (SC -Ade) with various concentrations of 9cRA
(Figure 6). Ligand-activated growth is observed only in the sector of the plate
containing both GBD:RXR and SRC1:GAD, and the same trend is observed
with SRC-1 as with the ACTR coactivator (Figure 6).

To verify that the increased sensitivity is from specific interactions
between the coactivator and the active conformation of the receptor, a series
of further controls was devised. pGAD10, a plasmid containing the Gal4
activation domain (GAD) without a coactivator domain, was cotransformed
with pGBDRXR. The plasmid was also transformed alone. pGAD10BAACTR,
pGAD10BASRC1, pGBT9Gal4, and pGBDMT were all transformed
individually. These controls were streaked onto adenine selective plates
(SC -Ade) with and without 9cRA. In the absence of ligand, only yeast carrying
the entire Gal4 gene (pGBT9Gal4) grow, as expected (data not shown). In the
presence of 10'^ M 9cRA, growth occurs with GBD:RXR with ACTR:GAD and
GBD:RXR with SRC1:GAD. The Gal4 AD only (without the coactivator
domain) with GBD:RXR displays no growth. These results verify that the
increase in chemical complementation is specifically due to the interaction of
the coactivator fusion protein with the ligand-bound nuclear receptor (data not
shown).

Example 7: Chemical complementation and negative selection

Negative selection is the opposite of classical genetic
complementation. Instead of allowing the microbe to survive, a functional
gene kills the microbe; only cells containing non-functional genes survive and
form colonies on selective plates. Negative selection is useful for finding
mutations that disrupt the function of a protein.

For negative selection in yeast, others have generated yeast strains
that contain Gal4 response elements (REs) fused to the URA3 gene. The
URA3 gene codes for orotidine-5'-phosphate decarboxylase, an enzyme in
the uracil biosynthetic pathway. This gene can be used for both positive and
negative selection. For positive selection, yeast expressing this gene will
survive in the absence of uracil in the media. For negative selection, uracil
and 5-fluoroorotic acid (FOA) are added to the media. Expression of
orotidine-5'-phosphate decarboxylase converts FOA to the toxin 5-fluorouracil,
which kills the yeast. As used herein, the term "negative chemical
complementation" refers to negative selection that occurs due to the
presence of a small molecule.

Plasmids pGBDRXR and pGAD10BAACTR were individually
transformed and co-transformed into MaV103. Transformants were streaked
onto uracil selective plates (SC -Ura-Trp) with 9cRA for positive selection
(data not shown). The same trend was seen with ACTR:GAD and
GBD:RXR in the MaV103 strain as seen previously with the PJ69-4A strain.
The same transformants were streaked onto selective plates (SC -Leu-Trp)
with FOA for negative chemical complementation. Varying concentrations of
9cRA were also added to the plates, ranging from 10*^ M to 10"® M. In the
absence of ligand (Figure 7B), yeast grow on the sector of the plate
containing ACTR:GAD with GBD:RXR. This is expected
because uracil is provided and, in the absence of ligand, RXR maintains its
inactive conformation, which prevents ACTR:GAD from binding, so transcription
does not occur. Without expression of the URA3 gene, 5-fluorouracil is not
produced and the yeast survive. However, as the concentration of ligand
increases (Figure 7B-7F), less growth occurs, and at the highest
concentration of ligand, 10"^ M, very little growth occurs. The small amount of
growth that is observed is due to background growth associated with negative
selection in this strain.
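
The positive and negative URA3 selections described above reduce to a small piece of logic. The sketch below is a deliberately idealized Boolean model: it ignores dose dependence and the background growth noted for this strain, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Receptor:
    name: str
    ligand_activated: bool       # recruits ACTR:GAD when ligand is bound
    constitutive: bool = False   # activates transcription without ligand


def ura3_expressed(receptor: Receptor, ligand_present: bool) -> bool:
    """URA3 reporter is on if the receptor is active under these conditions."""
    return receptor.constitutive or (receptor.ligand_activated and ligand_present)


def survives(receptor: Receptor, ligand_present: bool, selection: str) -> bool:
    """Idealized outcome of positive or negative selection on URA3."""
    expressed = ura3_expressed(receptor, ligand_present)
    if selection == "positive":
        # No uracil in the media: URA3 expression is required for growth.
        return expressed
    if selection == "negative":
        # Uracil plus FOA: URA3 expression produces toxic 5-fluorouracil.
        return not expressed
    raise ValueError("selection must be 'positive' or 'negative'")


wild_type = Receptor("GBD:RXR + ACTR:GAD", ligand_activated=True)
dead = Receptor("non-functional LBD", ligand_activated=False)

for receptor in (wild_type, dead):
    for ligand in (False, True):
        print(receptor.name, "| ligand:", ligand,
              "| positive:", survives(receptor, ligand, "positive"),
              "| negative:", survives(receptor, ligand, "negative"))
```

In this model a ligand-responsive receptor survives negative selection only without ligand, which mirrors the loss of growth with increasing 9cRA in Figure 7.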

Negative chemical complementation is advantageous for engineering
receptors for new small molecules for several reasons. First, mutant receptor
libraries may contain constitutively active receptors or receptors that activate
transcription in response to endogenous small molecules. These undesirable
receptors can be removed from the library with negative selection. Second,
in some cases it will be desirable to remove members of the library that
activate in response to certain small molecules, e.g. the natural ligands.
Negative chemical complementation will remove these members of the
library. The remaining library can then be put through chemical
complementation with the small molecule of interest. Third, for enzyme
engineering, negative chemical complementation can remove library members
that produce a particular small molecule, e.g. an enantiomer of the compound
of interest. The remaining mutant enzyme library can then be put through
chemical complementation to find those capable of producing the small
molecule of interest. Fourth, for drug discovery, chemical libraries can be
efficiently evaluated for antagonists of nuclear receptors by their ability to
allow the yeast to survive negative chemical complementation.

Example 8: Chemical complementation with RXR mutants.

Several RXR mutants previously tested in both mammalian cell assays
and with chemical complementation in yeast (without the coactivator fusion
protein) showed a general, but less than complete, correlation. Without the
coactivator fusion protein, ligand-activated growth was observed only with
wild-type RXR and the F439L mutant after five days of incubation; none of
the other mutants showed ligand-activated growth. The variation in the
transcription machinery could lead to the different patterns in activation. To
test whether the adapter fusion protein could overcome the differences and
show a more direct correlation, all the mutants in Table 3 were cloned into
pGBD vectors and cotransformed into yeast with pGAD10BAACTR. Again,
transformants were selected from SC -Leu-Trp plates and then streaked onto
adenine selective plates (SC -Ade-Trp). These mutants were tested with
9cRA and LG335 (a near-drug, a synthetic compound structurally similar to
an RXR agonist but that does not activate wild-type RXR) (Table 3).

The transcriptional activation patterns of these mutants in chemical
complementation with the addition of ACTR:GAD were observed on dose
response plates containing either 9cRA or the synthetic ligand, LG335
(Figure 8). On the plate without ligand, growth occurs on the sector of the
plate containing Gal4, but growth also occurs on the sector of the plate with
the two mutants F313I and F313I;F439L. This could be a result of the
mutations causing a structural modification to the binding pocket that is
favorable for the binding of an endogenous small molecule in yeast. At 10'^M
9cRA, growth occurs on the sectors of the plate with the single mutants
C432G, Q275C, I268F, I310M, V342F, and F439L, as well as some of the
triple mutants, I310M;F313I;F439L and Q275C;F313I;V342F. As the
concentration of ligand decreases, some mutants no longer show
ligand-activated growth. At 10'^ M 9cRA, growth is observed with the F439L
mutant as well as wild-type RXR (Figure 8). At the lowest concentration of
ligand, 10"® M 9cRA, growth is observed in the Gal4 and F313I sectors of the
plates. For the synthetic ligand LG335, growth is observed with several of the
single, double and triple mutants at 10'^ M (Figure 8). At lower concentrations
of ligand, the single mutants do not show much growth. However, several of the
double and triple mutants (I310M;F313I;F439L, Q275C;F313I, and
I310M;F313I) display ligand-activated growth at 10"^M LG335. At 10"® M
LG335, some growth is still observed in the I310M;F313I;F439L sector of the
plate.

A correlation is apparent between yeast growth and transcriptional
activation in mammalian cells when quantitating these results and comparing
them with results from cell culture assays (Table 3). The I268F, Q275C,
C432G, I310M, and I310M;F313I;F439L mutants, which had previously not
shown any growth with chemical complementation, grow with the ACTR:GAD
fusion protein (Figure 8). The more direct correlation between chemical
complementation and mammalian cell assays shows that the coactivator
fusion protein (ACTR:GAD) serves to bridge millions of years of evolution by
adapting mammalian nuclear receptor function to the yeast transcription
machinery.

Definitions

As used herein, the term "polynucleotide" generally refers to any
polyribonucleotide or polydeoxyribonucleotide, which may be unmodified RNA or
DNA or modified RNA or DNA. Thus, for instance, polynucleotides as used
herein refers to, among others, single- and double-stranded DNA, DNA that is a
mixture of single- and double-stranded regions, single- and double-stranded
RNA, and RNA that is a mixture of single- and double-stranded regions, hybrid
molecules comprising DNA and RNA that may be single-stranded or, more
typically, double-stranded or a mixture of single- and double-stranded regions.
The terms "nucleic acid," "nucleic acid sequence," or "oligonucleotide" also
encompass a polynucleotide as defined above.

In addition, polynucleotide as used herein refers to triple-stranded regions
comprising RNA or DNA or both RNA and DNA. The strands in such regions
may be from the same molecule or from different molecules. The regions may
include all of one or more of the molecules, but more typically involve only a
region of some of the molecules. One of the molecules of a triple-helical region
often is an oligonucleotide.

It will be appreciated that a great variety of modifications have been made
to DNA and RNA that serve many useful purposes known to those of skill in the
art. The term polynucleotide as it is employed herein embraces such chemically,
enzymatically or metabolically modified forms of polynucleotides, as well as the
chemical forms of DNA and RNA characteristic of viruses and cells, including
simple and complex cells, inter alia.

The term "oligonucleotide" refers to relatively short polynucleotides.
Typically the term refers to single-stranded deoxyribonucleotides, but it can
refer as well to single- or double-stranded ribonucleotides, RNA:DNA hybrids and
double-stranded DNAs, among other compounds containing multiple nucleotides
linked through phosphodiester bonds. The phosphodiester bonds are typically
5'-3' linkages between the deoxyribose or ribose sugars of adjacent nucleotides,
which is the predominant mode of nucleotide coupling in natural DNA or RNA,
respectively. The nucleotides of an oligonucleotide can be the naturally
occurring ribonucleotides, rA, rC, rG and rU; deoxyribonucleotides, dA, dC, dG
and dT; or other compounds in which the backbone and/or the base moieties differ
from the standard nucleotides of DNA and RNA.

The term "non-natural" means not typically found in nature including those
items modified by man. Non-natural includes chemically modified subunits such
as nucleotides as well as biopolymers having non-natural linkages, backbones,
or substitutions.

The term "non-natural backbone" means a covalent chemical linkage that
couples together two or more nucleotides in a manner that is not identical to the
naturally-occurring RNA or DNA phosphodiester backbones. Chemical
deviations from the natural backbone can include, but are not limited to,
chemical modification of a single site on the natural backbone or the
replacement of a component of the backbone with a completely different
chemical group. Methylation of the O2' site on the ribose sugar is an example of
a chemical difference from the natural backbone that would constitute a
non-natural backbone. Replacement of the ribose sugar with a hexose sugar and/or
replacement of the phosphate group in DNA or RNA with a phosphorothioate
group are also examples of non-natural backbones. Exemplary modified
oligonucleotide backbones include, for example, phosphorothioates, chiral
phosphorothioates, phosphorodithioates, phosphotriesters,
aminoalkylphosphotriesters, methyl and other alkyl phosphonates including
3'-alkylene phosphonates, 5'-alkylene phosphonates and chiral phosphonates,
phosphinates, phosphoramidates including 3'-amino phosphoramidate and
aminoalkylphosphoramidates, thionophosphoramidates,
thionoalkylphosphonates, thionoalkylphosphotriesters, selenophosphates and
boranophosphates having normal 3'-5' linkages, 2'-5' linked analogs of these,
and those having inverted polarity wherein one or more internucleotide linkages
is a 3' to 3', 5' to 5' or 2' to 2' linkage. Representative oligonucleotides
having inverted polarity comprise a single 3' to 3' linkage at the 3'-most
internucleotide linkage, i.e., a single inverted nucleoside residue which may be
abasic (the nucleobase is missing or has a hydroxyl group in place thereof).

Some oligonucleotide backbones do not include a phosphorus atom
therein and have backbones that are formed by short chain alkyl or cycloalkyl
internucleoside linkages, mixed heteroatom and alkyl or cycloalkyl
internucleoside linkages, or one or more short chain heteroatomic or heterocyclic
internucleoside linkages. These include those having morpholino linkages
(formed in part from the sugar portion of a nucleoside); siloxane backbones;
sulfide, sulfoxide and sulfone backbones; formacetyl and thioformacetyl
backbones; methylene formacetyl and thioformacetyl backbones; riboacetyl
backbones; alkene containing backbones; sulfamate backbones;
methyleneimino and methylenehydrazino backbones; sulfonate and sulfonamide
backbones; amide backbones; and others having mixed N, O, S and CH2
component parts.

Some embodiments synthesize or use oligonucleotides with
phosphorothioate backbones and oligonucleosides with heteroatom backbones,
and in particular -CH2-NH-O-CH2-, -CH2-N(CH3)-O-CH2- [known as a
methylene (methylimino) or MMI backbone], -CH2-O-N(CH3)-CH2-,
-CH2-N(CH3)-N(CH3)-CH2- and -O-N(CH3)-CH2-CH2- [wherein the native
phosphodiester backbone is represented as -O-P-O-CH2-] of the above
referenced U.S. Pat. No. 5,489,677, and the amide backbones of the above
referenced U.S. Pat. No. 5,602,240.

In other embodiments, the disclosed methods and compositions may
comprise modified oligonucleotides containing one or more substituted sugar
moieties. Other modified oligonucleotides comprise one of the following at the 2'
position: OH; F; O-, S-, or N-alkyl; O-, S-, or N-alkenyl; O-, S- or N-alkynyl; or
O-alkyl-O-alkyl, wherein the alkyl, alkenyl and alkynyl may be substituted or
unsubstituted C1 to C10 alkyl or C2 to C10 alkenyl and alkynyl. Particularly
preferred are O[(CH2)nO]mCH3, O(CH2)nOCH3, O(CH2)nNH2, O(CH2)nCH3,
O(CH2)nONH2, and O(CH2)nON[(CH2)nCH3]2, where n and m are from 1 to about
10. Other oligonucleotides comprise one of the following at the 2' position: C1 to
C10 lower alkyl, substituted lower alkyl, alkenyl, alkynyl, alkaryl, aralkyl,
O-alkaryl or O-aralkyl, SH, SCH3, OCN, Cl, Br, CN, CF3, OCF3, SOCH3, SO2CH3,
ONO2, NO2, N3, NH2, heterocycloalkyl, heterocycloalkaryl, aminoalkylamino,
polyalkylamino, substituted silyl, an RNA cleaving group, a reporter group, an
intercalator, a group for improving pharmacokinetic properties and other
substituents having similar properties. Another modification includes
2'-methoxyethoxy (2'-O-CH2CH2OCH3, also known as 2'-O-(2-methoxyethyl) or
2'-MOE) (Martin et al. (1995) Helv. Chim. Acta, 78, 486-504), i.e., an
alkoxyalkoxy group. A further preferred modification includes
2'-dimethylaminooxyethoxy, i.e., a O(CH2)2ON(CH3)2 group, also known as
2'-DMAOE, and 2'-dimethylaminoethoxyethoxy (also known in the art as
2'-O-dimethyl-amino-ethoxy-ethyl or 2'-DMAEOE), i.e., 2'-O-CH2-O-CH2-N(CH3)2.

Other modifications include 2'-methoxy (2'-O-CH3), 2'-aminopropoxy
(2'-OCH2CH2CH2NH2), 2'-allyl (2'-CH2-CH=CH2), 2'-O-allyl (2'-O-CH2-CH=CH2)
and 2'-fluoro (2'-F). The 2'-modification may be in the arabino (up) position or
ribo (down) position. An exemplary 2'-arabino modification is 2'-F. Similar
modifications may also be made at other positions on the oligonucleotide,
particularly the 3' position of the sugar on the 3' terminal nucleotide or in
2'-5' linked oligonucleotides and the 5' position of 5' terminal nucleotide.
Oligonucleotides may also have sugar mimetics such as cyclobutyl moieties in
place of the pentofuranosyl sugar.

A further modification includes Locked Nucleic Acids (LNAs) in which the
2'-hydroxyl group is linked to the 3' or 4' carbon atom of the sugar ring thereby
forming a bicyclic sugar moiety. The linkage is preferably a methylene (-CH2-)n
group bridging the 2' oxygen atom and the 4' carbon atom wherein n is 1 or 2.
LNAs and preparation thereof are described in U.S. Patent No. 6,268,490 and
WO 99/14226.

Oligonucleotides may also include nucleobase (often referred to in the art
simply as "base") modifications or substitutions. As used herein, "unmodified" or
"natural" nucleobases include the purine bases adenine (A) and guanine (G),
and the pyrimidine bases thymine (T), cytosine (C) and uracil (U). Modified
nucleobases include other synthetic and natural nucleobases such as
5-methylcytosine (5-me-C), 5-hydroxymethyl cytosine, xanthine, hypoxanthine,
2-aminoadenine, 6-methyl and other alkyl derivatives of adenine and guanine,
2-propyl and other alkyl derivatives of adenine and guanine, 2-thiouracil,
2-thiothymine and 2-thiocytosine, 5-halouracil and cytosine, 5-propynyl uracil and
cytosine and other alkynyl derivatives of pyrimidine bases, 6-azo uracil, cytosine
and thymine, 5-uracil (pseudouracil), 4-thiouracil, 8-halo, 8-amino, 8-thiol,
8-thioalkyl, 8-hydroxyl and other 8-substituted adenines and guanines, 5-halo
particularly 5-bromo, 5-trifluoromethyl and other 5-substituted uracils and
cytosines, 7-methylguanine and 7-methyladenine, 2-F-adenine, 2-amino-adenine,
8-azaguanine and 8-azaadenine, 7-deazaguanine and 7-deazaadenine
and 3-deazaguanine and 3-deazaadenine. Further modified nucleobases include

tricyclic pyrimidines such as phenoxazine cytidine (1H-pyrimido[5,4-
b][1,4]benzoxazin-2(3H)-one), phenothiazine cytidine (1H-pyrimido[5,4-
b][1,4]benzothiazin-2(3H)-one), G-clamps such as a substituted phenoxazine
cytidine (e.g., 9-(2-aminoethoxy)-H-pyrimido[5,4-b][1,4]benzoxazin-2(3H)-one),
carbazole cytidine (2H-pyrimido[4,5-b]indol-2-one), pyridoindole cytidine
(H-pyrido[3',2':4,5]pyrrolo[2,3-d]pyrimidin-2-one). Modified nucleobases may also
include those in which the purine or pyrimidine base is replaced with other
heterocycles, for example 7-deaza-adenine, 7-deazaguanosine, 2-aminopyridine
and 2-pyridone. Further nucleobases include those disclosed in U.S. Pat. No.
3,687,808, those disclosed in The Concise Encyclopedia of Polymer Science
and Engineering, pages 858-859, Kroschwitz, J. I., ed., John Wiley & Sons, 1990,
those disclosed by Englisch et al., Angewandte Chemie, International Edition,
1991, 30, 613, and those disclosed by Sanghvi, Y. S., Chapter 15, Antisense
Research and Applications, pages 289-302, Crooke, S. T. and Lebleu, B., ed.,
CRC Press, 1993. Certain of these nucleobases may be particularly useful for
increasing the binding affinity of the oligomeric compounds of the disclosure.

These include 5-substituted pyrimidines, 6-azapyrimidines and N-2, N-6 and O-6
substituted purines, including 2-aminopropyladenine, 5-propynyluracil and
5-propynylcytosine. 5-methylcytosine substitutions have been shown to increase
nucleic acid duplex stability by 0.6-1.2 °C (Sanghvi, Y. S., Crooke, S. T.
and Lebleu, B., eds., Antisense Research and Applications, CRC Press, Boca
Raton, 1993, pp. 276-278) and are presently preferred base substitutions, even
more particularly when combined with 2'-O-methoxyethyl sugar modifications.

The terms "including", "such as", "for example" and the like are intended
to refer to exemplary embodiments and not to limit the scope of the present
disclosure.

The term "polypeptides" includes proteins and fragments thereof.
Polypeptides are disclosed herein as amino acid residue sequences. Those
sequences are written left to right in the direction from the amino to the carboxy
terminus. In accordance with standard nomenclature, amino acid residue
sequences are denominated by either a three letter or a single letter code as
indicated as follows: Alanine (Ala, A), Arginine (Arg, R), Asparagine (Asn, N),
Aspartic Acid (Asp, D), Cysteine (Cys, C), Glutamine (Gln, Q), Glutamic Acid
(Glu, E), Glycine (Gly, G), Histidine (His, H), Isoleucine (Ile, I), Leucine (Leu, L),
Lysine (Lys, K), Methionine (Met, M), Phenylalanine (Phe, F), Proline (Pro, P),
Serine (Ser, S), Threonine (Thr, T), Tryptophan (Trp, W), Tyrosine (Tyr, Y), and
Valine (Val, V).
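
By way of illustration only, the three letter and single letter codes listed above can be held in a lookup table. The following Python sketch is not part of the disclosed methods; the names THREE_TO_ONE and to_one_letter are hypothetical and chosen here for the example.

```python
# Three-letter to single-letter amino acid codes, as listed above.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(residues):
    """Convert three-letter codes, given amino to carboxy terminus,
    into a single-letter string (hypothetical helper)."""
    # capitalize() normalizes inputs such as "ALA" or "ala" to "Ala".
    return "".join(THREE_TO_ONE[r.capitalize()] for r in residues)
```

For example, to_one_letter(["Met", "Ala", "Trp"]) yields the string "MAW", read from the amino to the carboxy terminus.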

"Variant" refers to a polypeptide or polynucleotide that differs from a
reference polypeptide or polynucleotide, but retains essential properties. A
typical variant of a polypeptide differs in amino acid sequence from another,
reference polypeptide. Generally, differences are limited so that the sequences
of the reference polypeptide and the variant are closely similar overall and, in
many regions, identical. A variant and reference polypeptide may differ in amino
acid sequence by one or more modifications (e.g., substitutions, additions,
and/or deletions). A substituted or inserted amino acid residue may or may not
be one encoded by the genetic code. A variant of a polypeptide may be
naturally occurring such as an allelic variant, or it may be a variant that is not
known to occur naturally.

Modifications and changes can be made in the structure of the
polypeptides of this disclosure and still obtain a molecule having similar
characteristics as the polypeptide (e.g., a conservative amino acid substitution).
For example, certain amino acids can be substituted for other amino acids in a
sequence without appreciable loss of activity. Because it is the interactive
capacity and nature of a polypeptide that defines that polypeptide's biological
functional activity, certain amino acid sequence substitutions can be made in a
polypeptide sequence and nevertheless obtain a polypeptide with like properties.

In making such changes, the hydropathic index of amino acids can be
considered. The importance of the hydropathic amino acid index in conferring
interactive biologic function on a polypeptide is generally understood in the art. It
is known that certain amino acids can be substituted for other amino acids
having a similar hydropathic index or score and still result in a polypeptide with
similar biological activity. Each amino acid has been assigned a hydropathic
index on the basis of its hydrophobicity and charge characteristics. Those
indices are: isoleucine (+4.5); valine (+4.2); leucine (+3.8); phenylalanine (+2.8);
cysteine/cystine (+2.5); methionine (+1.9); alanine (+1.8); glycine (-0.4);
threonine (-0.7); serine (-0.8); tryptophan (-0.9); tyrosine (-1.3); proline (-1.6);
histidine (-3.2); glutamate (-3.5); glutamine (-3.5); aspartate (-3.5); asparagine
(-3.5); lysine (-3.9); and arginine (-4.5).

It is believed that the relative hydropathic character of the amino acid
determines the secondary structure of the resultant polypeptide, which in turn
defines the interaction of the polypeptide with other molecules, such as
enzymes, substrates, receptors, antibodies, antigens, and the like. It is known in
the art that an amino acid can be substituted by another amino acid having a
similar hydropathic index and still obtain a functionally equivalent polypeptide. In
such changes, the substitution of amino acids whose hydropathic indices are
within ± 2 is preferred, those within ± 1 are particularly preferred, and those
within ± 0.5 are even more particularly preferred.
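
The preference tiers described above can be sketched in code. The following Python example is illustrative only: the index values are those recited in this disclosure, while the names HYDROPATHY and substitution_tier, and the choice to round the difference to one decimal place before comparison, are assumptions made for the example.

```python
# Hydropathic indices as recited above, keyed by three-letter code.
HYDROPATHY = {
    "Ile": 4.5, "Val": 4.2, "Leu": 3.8, "Phe": 2.8, "Cys": 2.5,
    "Met": 1.9, "Ala": 1.8, "Gly": -0.4, "Thr": -0.7, "Ser": -0.8,
    "Trp": -0.9, "Tyr": -1.3, "Pro": -1.6, "His": -3.2, "Glu": -3.5,
    "Gln": -3.5, "Asp": -3.5, "Asn": -3.5, "Lys": -3.9, "Arg": -4.5,
}


def substitution_tier(original, replacement):
    """Return the preference tier for a substitution, or None if the
    hydropathic indices differ by more than 2 (hypothetical helper)."""
    # Round to one decimal to avoid floating-point noise at tier edges.
    diff = round(abs(HYDROPATHY[original] - HYDROPATHY[replacement]), 1)
    if diff <= 0.5:
        return "even more particularly preferred"
    if diff <= 1.0:
        return "particularly preferred"
    if diff <= 2.0:
        return "preferred"
    return None
```

Under these assumptions, an isoleucine to valine change (difference 0.3) falls in the narrowest tier, lysine to arginine (0.6) in the middle tier, and leucine to methionine (1.9) in the widest tier, whereas isoleucine to arginine (9.0) falls outside all three.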

Substitution of like amino acids can also be made on the basis of
hydrophilicity, particularly where the biological functional equivalent polypeptide
or peptide thereby created is intended for use in immunological embodiments.
The following hydrophilicity values have been assigned to amino acid residues:
arginine (+3.0); lysine (+3.0); aspartate (+3.0 ± 1); glutamate (+3.0 ± 1); serine
(+0.3); asparagine (+0.2); glutamine (+0.2); glycine (0); proline (-0.5 ± 1);
threonine (-0.4); alanine (-0.6); histidine (-0.5); cysteine (-1.0); methionine (-1.3);
valine (-1.5); leucine (-1.8); isoleucine (-1.8); tyrosine (-2.3); phenylalanine (-2.5);
tryptophan (-3.4). It is understood that an amino acid can be substituted for
another having a similar hydrophilicity value and still obtain a biologically
equivalent, and in particular, an immunologically equivalent polypeptide. In such
changes, the substitution of amino acids whose hydrophilicity values are within ±
2 is preferred, those within ± 1 are particularly preferred, and those within ± 0.5
are even more particularly preferred.

As outlined above, amino acid substitutions are generally based on the
relative similarity of the amino acid side-chain substituents, for example, their
hydrophobicity, hydrophilicity, charge, size, and the like. Exemplary substitutions
that take various of the foregoing characteristics into consideration are well
known to those of skill in the art and include (original residue: exemplary
substitution): (Ala: Gly, Ser), (Arg: Lys), (Asn: Gln, His), (Asp: Glu, Cys, Ser),
(Gln: Asn), (Glu: Asp), (Gly: Ala), (His: Asn, Gln), (Ile: Leu, Val), (Leu: Ile, Val),
(Lys: Arg), (Met: Leu, Tyr), (Ser: Thr), (Thr: Ser), (Trp: Tyr), (Tyr: Trp, Phe), and
(Val: Ile, Leu). Embodiments of this disclosure thus contemplate functional or
biological equivalents of a polypeptide as set forth above. In particular,
embodiments of the polypeptides can include variants having about 50%, 60%,
70%, 80%, 90%, and 95% sequence identity to the polypeptide of interest.
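
The exemplary substitutions listed above lend themselves to a simple lookup. The following Python sketch is illustrative only; the names EXEMPLARY_SUBSTITUTIONS and is_exemplary_substitution are hypothetical, and residues for which no exemplary substitution is listed above (Cys, Phe, Pro) are deliberately absent from the table.

```python
# Original residue -> exemplary substitutions, as listed above.
EXEMPLARY_SUBSTITUTIONS = {
    "Ala": ("Gly", "Ser"), "Arg": ("Lys",), "Asn": ("Gln", "His"),
    "Asp": ("Glu", "Cys", "Ser"), "Gln": ("Asn",), "Glu": ("Asp",),
    "Gly": ("Ala",), "His": ("Asn", "Gln"), "Ile": ("Leu", "Val"),
    "Leu": ("Ile", "Val"), "Lys": ("Arg",), "Met": ("Leu", "Tyr"),
    "Ser": ("Thr",), "Thr": ("Ser",), "Trp": ("Tyr",),
    "Tyr": ("Trp", "Phe"), "Val": ("Ile", "Leu"),
}


def is_exemplary_substitution(original, replacement):
    """True if the replacement is listed for the original residue
    (hypothetical helper; unlisted originals return False)."""
    return replacement in EXEMPLARY_SUBSTITUTIONS.get(original, ())
```

The table is directional as written: Asp lists Cys as an exemplary substitution, but Cys has no entry of its own, so the reverse lookup returns False.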

"Identity," as known in the art, is a relationship between two or more
polypeptide sequences, as determined by comparing the sequences. In the art,
"identity" also means the degree of sequence relatedness between polypeptides
as determined by the match between strings of such sequences. "Identity" and
"similarity" can be readily calculated by known methods, including, but not limited
to, those described in (Computational Molecular Biology, Lesk, A. M., Ed.,
Oxford University Press, New York, 1988; Biocomputing: Informatics and
Genome Projects, Smith, D. W., Ed., Academic Press, New York, 1993;
Computer Analysis of Sequence Data, Part I, Griffin, A. M., and Griffin, H. G.,
Eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular
Biology, von Heinje, G., Academic Press, 1987; and Sequence Analysis Primer,
Gribskov, M. and Devereux, J., Eds., M Stockton Press, New York, 1991; and
Carillo, H., and Lipman, D., SIAM J Applied Math., 48: 1073 (1988)).

Preferred methods to determine identity are designed to give the largest
match between the sequences tested. Methods to determine identity and
similarity are codified in publicly available computer programs. The percent
identity between two sequences can be determined by using analysis software
(i.e., Sequence Analysis Software Package of the Genetics Computer Group,
Madison Wis.) that incorporates the Needleman and Wunsch (J. Mol. Biol., 48:
443-453, 1970) algorithm (e.g., NBLAST, and XBLAST). The default parameters
are used to determine the identity for the polypeptides of the present invention.
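
For readers unfamiliar with global alignment, the following Python sketch illustrates the general Needleman and Wunsch dynamic-programming approach and a percent identity computed from the resulting alignment. It is a minimal illustration, not the referenced software package: the scoring values (match +1, mismatch -1, linear gap -1) and the definition of percent identity as matches divided by alignment length are assumptions made for the example and are not the default parameters referred to above.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return the two gapped strings.
    Scoring values are illustrative assumptions."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + s,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        s = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + s:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Percent of aligned columns that are identical residues."""
    aln_a, aln_b = needleman_wunsch(a, b)
    if not aln_a:  # two empty sequences: treated as identical by convention
        return 100.0
    matches = sum(x == y and x != "-" for x, y in zip(aln_a, aln_b))
    return 100.0 * matches / len(aln_a)
```

Other conventions divide by the length of the reference sequence or of the shorter sequence rather than by the alignment length; the choice affects the reported value whenever the alignment contains gaps.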

By way of example, a polypeptide sequence may be identical to the
reference sequence, that is be 100% identical, or it may include up to a certain
integer number of amino acid alterations as compared to the reference sequence
such that the % identity is less than 100%. Such alterations are selected from:
at least one amino acid deletion, substitution, including conservative and non-
conservative substitution, or insertion, and wherein said alterations may occur at
the amino- or carboxy-terminal positions of the reference polypeptide sequence
or anywhere between those terminal positions, interspersed either individually
among the amino acids in the reference sequence or in one or more contiguous
groups within the reference sequence. The number of amino acid alterations for
a given % identity is determined by multiplying the total number of amino acids in
the reference polypeptide by the numerical percent of the respective percent
identity (divided by 100) and then subtracting that product from said total number
of amino acids in the reference polypeptide.
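
The arithmetic described above can be illustrated with a short Python sketch. The function name max_alterations is hypothetical, and rounding a non-integer result down to the nearest whole number is an assumption made here so that the result is an integer number of alterations.

```python
import math


def max_alterations(reference_length, percent_identity):
    """Total residues minus (total residues x percent identity / 100).
    Rounding down to a whole number is an assumption of this sketch."""
    product = reference_length * percent_identity / 100
    return math.floor(reference_length - product)
```

For a 200-residue reference polypeptide at 95% identity, the product is 190 and the result is 10 alterations; for a 237-residue reference at 95% identity, the product is 225.15 and the difference of 11.85 is rounded down to 11.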

"Operably linked" refers to a juxtaposition wherein the components are 
configured so as to perform their usual function. For example, control
sequences or promoters operably linked to a coding sequence are capable of
effecting the expression of the coding sequence.

As used herein, the term "transfection" refers to the introduction of a
nucleic acid sequence into the interior of a membrane enclosed space of a living
cell, including introduction of the nucleic acid sequence into the cytosol of a cell
as well as the interior space of a mitochondrion, nucleus or chloroplast. The
nucleic acid may be in the form of naked DNA or RNA, associated with various
proteins, or the nucleic acid may be incorporated into a vector.

As used herein, the term "vector" is used in reference to a vehicle used to
introduce a nucleic acid sequence into a cell. A viral vector is a virus that has
been modified to allow recombinant DNA sequences to be introduced into host
cells or cell organelles.

The term "selective agent" refers to a substance that is required for
growth or for preventing growth of a cell or microorganism, for example cells or
microorganisms that have been engineered to require a specific substance for
growth or inhibit or reduce growth in the absence of a complementing factor.
Exemplary complementing factors include enzymes that degrade the selective
agent, or enzymes that produce a selective agent. Generally, selective agents
include, but are not limited to, amino acids, antibiotics, nucleic acids, minerals,
nutrients, etc. Selective media generally refers to culture media deficient in at
least one substance, for example a selective agent, required for growth. The
addition of a selective agent to selective media results in media sufficient for
growth.

As used herein, the term "coregulator" refers to a transcription modulator.

It should be emphasized that the above-described embodiments of the
present disclosure, particularly any "preferred" embodiments, are merely
possible examples of implementations, merely set forth for a clear understanding
of the principles of the disclosed subject matter. Many variations and
modifications may be made to the above-described embodiment(s) without
departing substantially from the spirit and principles of the disclosure. All such
modifications and variations are intended to be included herein within the scope
of this disclosure and protected by the following claims.

Table 3: Activation of Transcription by Wild-Type RXR and Variant Receptors
in CV-1 Cells and in Yeast Strain PJ69-4a with pGAD-ACTR Plasmid and
pGBDRXR-Mutant Plasmids.

[Table body and most of its legend are not recoverable from the source. Legible
legend fragments: activation by ligand is reported relative to wild-type, with one
symbol denoting 50-70% of wild-type activation by ligand.]
— = no growth

Table 4: Nine of nine sequences randomly chosen from the unselected library all contain
background sequences, not the designed sequences. Twelve of twelve sequences
randomly chosen from the selected library (indicating that they are functional) all contain
designed sequences. This result demonstrates that chemical complementation is very
efficient at identifying functional receptors within a large collection of nonfunctional
receptors.

1. A method for identifying receptors, comprising:

(a) introducing a first polynucleotide encoding a receptor into a
cell, wherein the receptor comprises a ligand binding domain for a target ligand
operably linked to a polynucleotide binding domain so that binding of the target
ligand to the receptor activates transcription of a second polynucleotide
complementing a selection agent;

(b) culturing the cell on the selective media in the presence of
the target ligand, wherein growth of the cell indicates interaction of the receptor
with the target ligand.

2. The method of claim 1, further comprising culturing the cell on
selective media in the absence of the target ligand, wherein growth of the cell
indicates the receptor constitutively activates transcription of the second
polynucleotide.

3. A cell comprising: 

(a) a recombinant nuclear receptor that induces expression of a
first polynucleotide in response to interaction with a target substance, wherein
expression of the first polynucleotide complements a selective agent; and

(b) an adapter fusion protein comprising a human coregulator
domain operably linked to an activation domain, wherein the adapter fusion
protein enhances transcription of the first polynucleotide induced by the
recombinant nuclear receptor.

4. The cell of claim 3, wherein the cell is eukaryotic or prokaryotic. 

5. The cell of claim 3, wherein the cell is a yeast cell. 

6. A method for identifying enzymes comprising:

(a) introducing a first polynucleotide into a cell that is unable to
grow on selective media, wherein the cell expresses a recombinant receptor
polypeptide that activates transcription of a second polynucleotide in response to
interaction of the recombinant receptor polypeptide with a target substance;

(b) culturing the cell on the selective media; and

(c) selecting the cell that grows on the selective media.

7. The method of claim 6, wherein the target substance is produced
by a polypeptide encoded by the first polynucleotide.

8. The method of claim 6, wherein a single target substance induces 
a conformational change in the recombinant receptor to activate transcription. 

9. The method of claim 6, wherein the target substance is unmodified.

10. The method of claim 6, wherein growth on the selective media 
indicates the first polynucleotide encodes a product that complements the 
selective media. 

11. The method of claim 6, wherein the cell is a eukaryotic or
prokaryotic cell.

12. The method of claim 6, wherein the selective media does not 
contain an amino acid necessary for survival. 

13. The method of claim 12, wherein the amino acid is selected from 
the group consisting of histidine and alanine. 

14. The method of claim 6, wherein the first polynucleotide encodes an
enzyme that produces the target substance.

15. The method of claim 6, wherein the transformed cell further
expresses an adaptor fusion protein.

16. The method of claim 7, wherein the adaptor fusion protein 16. 

17. The method of claim 6, wherein the first polynucleotide encodes an
engineered enzyme.

18. The method of claim 6, wherein the first polynucleotide encodes a
naturally occurring enzyme.

19. The method of claim 6, comprises a human coactivator for
transcription of the second polynucleotide.

20. A cell comprising: 

(a) a recombinant nuclear receptor that induces transcription of 
a first polynucleotide in response to interaction with a target substance; and 

(b) an adapter fusion protein comprising a human coactivator
domain operably linked to an activation domain, wherein the adapter fusion
protein enhances transcription of the first polynucleotide induced by the
recombinant nuclear receptor.

21. The cell of claim 20, wherein the human coactivator domain is
selected from the group consisting of SRC-1 and ACTR.

22. The cell of claim 20, wherein the cell is unable to grow on selective
media.

23. A method for selecting cells comprising: 

(a) introducing a first polynucleotide into a cell, wherein the cell
expresses a recombinant receptor polypeptide that activates transcription
of a second polynucleotide in response to interaction of the recombinant
receptor polypeptide with a target substance;

(b) culturing the cell on selective media in the presence of a first
selection agent; and

(c) selecting the cell that survives on the selective media in the
presence of the selection agent, wherein expression of the second
polynucleotide inhibits growth of the cell.

24. The method of claim 23, wherein the second polynucleotide 
encodes a cytotoxic polypeptide. 

25. The method of claim 24, wherein the cytotoxic polypeptide
comprises a proapoptotic polypeptide.

26. The method of claim 21, wherein the first selective agent
comprises 5-fluoroorotic acid.

27. The method of claim 26, wherein the second polynucleotide
encodes orotidine-5'-phosphate decarboxylase.

28. The method of claim 27, wherein the toxic substance comprises
5-fluorouracil.

29. A method for assembling an enzymatic pathway comprising:

(a) introducing a plurality of polynucleotides encoding enzymes
having different substrates into a cell that is unable to grow on selective media,
wherein the cell expresses a recombinant receptor polypeptide that activates
transcription of a second polynucleotide in response to interaction of the
recombinant receptor polypeptide with a target substance;

(b) culturing the cell on the selective media; and

(c) selecting the cell that grows on the selective media, wherein
growth of a cell on the selective media indicates that the plurality of
polynucleotides encode enzymes for producing products that complement the
selective media.

30. The method of claim 31, wherein the product of one of the
enzymes is the substrate of another of the enzymes.

31. A method for identifying receptors, comprising:

(a) introducing a first polynucleotide encoding a receptor into a
cell, wherein the receptor comprises a ligand binding domain for a target ligand
operably linked to a response element so that binding of the target ligand to the
receptor activates transcription of a second polynucleotide complementing a
selection agent;

(b) culturing the cell on the selective media in the presence of
the target ligand, wherein growth of the cell indicates interaction of the receptor
with the target ligand.

<110> Georgia Tech Research Corporation
Donald, Doyle F.
Bahareh, Azizi
Lauren, Schwimmer J.

<120> ENGINEERING ENZYMES THROUGH GENETIC SELECTION 

<130> 820701-2810 

<150> 60/520,754 
<151> 2003-11-17 

<150> 60/520,813 
<151> 2003-11-17 

<160> 18 

<170> PatentIn version 3.3

<210> 1 

<211> 17 

<212> DNA 

<213> artificial sequence 
<220> 

<223> Cassette 1, F 

<400> 1 

cggaatttcc catgggc 
17 



<210> 2 

<211> 37 

<212> DNA 

<213> artificial sequence 
<220> 

<223> BPf 

<400> 2 

ctcgccgaac gacccggtca ccgcatgcca ctagtgg 
37 



<210> 3 

<211> 36 

<212> DNA 

<213> artificial sequence 

<220> 

<223> BPr 

<400> 3 

ccgcttggcc cactccacta gtggcatgcg gtgacc 
36 



<210> 4

<211> 37

<212> DNA

<213> artificial sequence 



<220> 

<223> cassette 2, BPf, BPr, SEf 
<400> 4 

cgggcaggct ggaatgagct cctcgacgga attctcc 
37 



<210> 5 

<211> 36 

<212> DNA 

<213> artificial sequence 
<220> 

<223> SEr

<400> 5 

cagcccggtg gccaggagaa ttccgtcgag gagctc 
36 



<210> 6 

<211> 40 

<212> DNA 

<213> artificial sequence 



<220> 

<223> cassette 3, SEf, SEr, AMf 



<400> 6 

ctctgcgctc catcgggctt aagtgcccac caattgacac 
40 



<210> 7 

<211> 46 

<212> DNA 

<213> artificial sequence 



<220> 

<223> AMr 
<400> 7 

ctccagcatc tccataagga aggtgtcaat tggtgggcac ttaagc 
46 



<210> 8 

<211> 17 

<212> DNA 

<213> artificial sequence
<220> 

<223> cassette 4, AMf, AMr, and R 

<400> 8

caaaggacgg gccgcag
17



<210> 9 

<211> 46 

<212> DNA 

<213> artificial sequence 
<220> 

<223> oligos BP1

<400> 9 

ggcaaacatg gggctgaacc ccagctcgcc gaacgacccg gtcacc 
46 



<210> 10 

<211> 66 

<212> DNA 

<213> artificial sequence 
<220> 

<223> BP2 



<220>
<221> misc_feature
<222> (33)..(33)
<223> A, C or T

<220>
<221> misc_feature
<222> (34)..(34)
<223> A or G

<220>
<221> misc_feature
<222> (35)..(35)
<223> C or G

<220>
<221> misc_feature
<222> (36)..(36)
<223> A, C or T

<220>
<221> misc_feature
<222> (37)..(37)
<223> A or G

<220>
<221> misc_feature
<222> (38)..(38)
<223> C or G

<220>
<221> misc_feature
<222> (45)..(45)
<223> A, C or T

<220>
<221> misc_feature
<222> (46)..(46)
<223> A or G

<220>
<221> misc_feature
<222> (47)..(47)
<223> C or G

<400> 10

gcccactcca ctagtgtgaa aagctgtttg tcnnnnnntt ggcanngttg gtgaccgggt
60

cgttcg
66



<210> 11

<211> 48

<212> DNA

<213> artificial sequence 



<220> 

<223> BP3 



<400> 11 

cttttcacac tagtggagtg ggccaagcgg atcccacact tctcagag 
48 



<210> 12 

<211> 28 

<212> DNA

<213> artificial sequence



<220> 
<223> BP4 



<400> 12 

ggggcagctc tgagaagtgt gggatccg 
28 



<210> 13 

<211> 48 

<212> DNA

<213> artificial sequence
<220>

<223> SE1



<220>
<221> misc_feature
<222> (22)..(22)
<223> A, G or T

<220>
<221> misc_feature
<222> (23)..(23)
<223> C or T

<220>
<221> misc_feature
<222> (24)..(24)
<223> G or C

<220>
<221> misc_feature
<222> (31)..(31)
<223> A, G or T

<220>
<221> misc_feature
<222> (32)..(32)
<223> C or T

<220>
<221> misc_feature
<222> (33)..(33)
<223> G or C

<400> 13 

gcaggctgga atgagctcct cnnngcctcc nnntcccacc gctccatc 
48 



<210> 14 

<211> 46 

<212> DNA 

<213> artificial sequence
<220> 

<223> SE2 

<400> 14 

ccggtggcca ggagaattcc gtccttcacg gcgatggagc ggtggg 
46 



<210> 15 

<211> 63 

<212> DNA 

<213> artificial sequence
<220>

<223> AM1



<220>
<221> misc_feature
<222> (38)..(38)
<223> A, G or T

<220>
<221> misc_feature
<222> (39)..(39)
<223> C or T

<220>
<221> misc_feature
<222> (40)..(40)
<223> G or C

<400> 15

ggctctgcgc tccatcgggc ttaagtgcct ggaacatnnn ttscttcttc aagctcatcg
60

ggg
63



<210> 16 

<211> 51 

<212> DNA

<213> artificial sequence 
<220> 

<223> AM2 

<400> 16 

gcatctcaat aaggaaggtg tcaattgtgt gtccccgatg agcttgaaga a 
51 



<210> 17 
<211> 12414
<212> DNA 

<213> artificial sequence 
<220> 

<223> pGAD10BA seq 020803
<400> 17 

gcttgcatgc aacttctttt cttttttttt cttttctctc tcccccgttg ttgtctcacc 
60 

atatccgcaa tgacaaaaaa aatgatggaa gacactaaag gaaaaaatta acgacaaaga 
120 

cagcaccaac agatgtcgtt gttccagagc tgatgagggg tatcttcgaa cacacgaaac 
180 

tttttccttc cttcattcac gcacactact ctctaatgag caacggtata cggccttcct 
240 

tccagttact tgaatttgaa ataaaaaaag tttgccgctt tgctatcaag tataaataga 
300 

cctgcaatta ttaatctttt gtttcctcgt cattgttctc gttccctttc ttccttgttt 
360 

ctttttctgc acaatatttc aagctatacc aagcatacaa tcaactccaa gctttgcaaa 
420 

gatggataaa gcggaattaa ttcccgagcc tccaaaaaag aagagaaagg tcgaattggg 
480 

taccgccgcc aattttaatc aaagtgggaa tattgctgat agctcattgt ccttcacttt 
540 
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cactaacagt agcaacggtc cgaacctcat 
600 

acaaccaatt gcctcctcta acgttcatga 
660 

taaaattgat gatggtaata attrcaaaacc 

720 

gtataacgcg tttggaatca ctacagggat 
780 

taactatcta ttcgatgatg aagatacccc 
84 0 

tggattagga gaaaacttgg atccactggc 
900 

tgatactcca ggacaaggtc ttacctgcag 
960 

taaaCatatt: gaagaattgg ctgagctgat 
1020 

caatgtcaaa ccagataaat gtgcgatttt 
1080 

aaaagagcaa ggaaaaacta tttccaatga 
114 0 

tacagggcag ggagttattg ataaagactic 
1200 

tggtttccta tttgtggtga atcgagacgg 
1260 

acaatacctg caatataagc aagaggacct 
1320 

tgaagaagac agaaaggat^t ttcttaagaa 
1380 

ctggacaaat gagacccaaa gacaaaaaag 
1440 

aacaccacat gatattctgg aagacataaa 
1500 

aacaatgcag tgctttgccc tgtctcagcc 
1560 

gcaatcttgt atgatctgtg tggcacgccg 
1620 

aaaccctgag agctttatta ccagacatga 
1680 

aaattcacCg agatcctcca tgaggcctgg 
1740 



aacaactcaa acaaattctc aagcgctctc 
taacttcatg aataatgaaa tcacggctag 
actgtcacct ggttggacgg accaaactgc 
gtttaatacc actacaatgg atgatgtata 
accaaaccca aaaaaagaga tctttatgag 
cagtgattca cgaaaacgca aattgccatg 
tggtgaaaaa cggagacggg agcaggaaag 
atctgccaat cttagtgata ttgacaattt 
aaaggaaaca gtaagacaga tacgtcaaat: 
tgatgatgtt caaaaagccg atgtatcttc 
cttaggaccg cttttacttc aggcattgga 
aaacattgCa tttgtatcag aaaatgtcac 
ggttaacaca agtgtttaca atatcttaca 
ttcaccaaaa tctacagtta atggagtttc 
ccatacattt aattgccgta tgttgatgaa 
cgccagtcct gaaatgcgcc agagatatga 
acgagctatg atggaggaag gggaagattt 
catitactaca ggagaaagaa catttccatc 
tctttcagga aaggttgtca atatagatac 
ctttgaagat ataatccgaa ggtgtattca 
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gagatttttt agtctaaatg 
1800 

taccagtgat gggatatttt 
1860 

atatcgattc tcgttggctg 
1920 

ccgaaatcct gtaacaaatig 
1980 

acagaatgga taCagaccaa 
2040 

tggatgcaac agttcggtag 
2100 

gagcagcagg gcctatggct 
2160 

gtatgggggt tccagCaaca 
2220 

titcctaccag aacaacaact 
2280 

tcttgcccca aaccagcaga 
2340 

agcctcacat cagttttctc 
2400 

tactgggaac cacagctttt 
2460 

tgtggggact tcccttttat 
2520 

caatatgaat: attacccaac 
2580 

cttttattgc gaccaaaatc 
2640 

cctcagtgac aaagaaagta 
2700 

tctggaaagc aaaggCcata 
2760 

gggtcattcc tccttgacca 
2820 

tgtcaccagc ccctctggag 
2880 

tatigcatggg tcactgtitac 
2940 



atgggcagtc atggtcccag 
ccccaacagc ttatcttaat 
atggaactat agtgactgca 
atcgacatgg ctttgtctca 
acccaaatcc tgttggacaa 
gcggcatgag tatgtcgcca 
tggcagaccc tiagcaccaca 
tagcttcatt gacccctggg 
ataggctcaa catgagtagc 
atatcatgat ttctcctcgt 
ctgttgcagg tgtgcactct 
ccagcagctc tctcagtgcc 
ctactctgtc atcaccaggc 
caagtaaagt aagcaatcag 
cagCggagag ttcaatgtgt 
aggagagcag tgttgagggg 
aaaaattact gcagttactt 
actcccccct agatitcaagt 
tctcctcctc tacatctgga 
aagagaagca ccggattttg 



aaacgtcact atcaagaagt 
ggccatgcag aaaccccagt 
cagacaaaaa gcaaactctt 
acccacttcc ttcagagaga 
gggattagac cacctatggc 
aaccaaggct tacagatgcc 
gggcagatga gtggagctag 
ccaggcatgc aatcaccatc 
cccccacatg ggagtcctgg 
aatcgtggga gtccaaagat 
cccatggcat cttctggcaa 
ctgcaagcca tcagtgaagg 
cccaaattgg ataactictcc 
gattccaaga gtcctctggg 
cagtcaaata gcagagatca 
gcagagaatc aaaggggtcc 
acctgttctt ctgatgaccg 
tgtaaagaa^ cttctgttag 
ggagtatcct ctacatccaa 
cacaagctgc tgcagaatgg 
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gaattcacca gctgaggtag ccaagatCac 
3000 

tataacctct tgtggggacg gaaatgtCgt 
3060 

ggagaacaat gcacttctta gatacctgct 

3120 

taaagaacta cagccccaag tggaaggagt 
3180 

caccattcct agctcaagtc aagagaaaga 
3240 

gggatctgga gacttggata acctagatgc 
3300 

ttacaataat tccatatcct caaatggtag 
3360 

aggaactaat tctctgggtt tgaaaagttc 
3420 

taaccgagca gtgtctctgg atagccctgt 
3480 

tatcagtgct ttccccatgt taccaaagca 
3540 

ggat:agtcag gaaaattatg gctcaagtat 
360O 

cggcagaatg gaacctatga attcaaactc 
3660 

ttctttaccc agacctgcac tgggtggctc 
3720 

cataccaggt gcgagaccag tattgcaaca 
3780 

tgaaatcccc atgggaatgg gggctaatcc 
3840 

gggttcctgg cccgatggca tgttgtccat 
3900 

gcctcttctt aggaattccc tggatgatct 
3960 

gagtgacgaa agagcatztat: tggaccagct 
4020 

aggcctggaa gaaattgaca gagctttggg 
4080 

attagagccc aaacaggatg ctttccaagg 
4140 
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tgcagaagcc actgggaaag acaccagcag 

caagcaggag cagctaagtc ctaagaagaa 

ggacagggat gatcctagtg atgcactctc 

ggataataaa atgagtcagC gcaccagctc 

ccctaaaatt aagacagaga caagtgaaga 

tattcttggt gatctgacta gttctgactt 

tcatctgggg actaagcaac aggtgtttca 

acagtctgtg cagtctattc gtcctccata 

ttctgttggc tcaagtcctc cagtaaaaaa 

acccatgttg ggtgggaatc caagaatgat 

gggagactgg ggcttaccaa actcaaaggc 

catgggaaga ccaggaggag attataatac 

tattcccaca ttgcctcttc ggtctaatag 

gcagcagcag atgcttcaaa tgaggcctgg 

ctatggccaa gcagcagcat: ctaaccaact 

ggaacaagct tctcatggca ctcaaaatag 

tgttgggcca ccttccaacc tggaaggcca 

gcacactctt ctcagcaaca cagatgccac 

cattcctgaa cttgtcaatc agggacaggc 

ccaagaagca gcagtaatga tggatcagaa 
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ggcaggatta tatggacaga catacccagc 
4200 

tcttcaggga caatcaccat cttttaactc 
4260 

ttttcctctc caaggaatgc acccacgagc 
4320 

caagcaactt agaatgcagc ttcagcagag 
4380 

ccgacaggca cttgaattga aaat:ggaaaa 
4440 

gcctatgatg cagccccagc agggttttct 
4500 

agagctgcta agtcatcact tccgacaaca 
4560 

acagcagcag cagcagcagc agcagcagca 
4620 

gcagcaaacc caggccttca gcccacctcc 
4680 

gcttttggca ggacccacaa tgccacaagc 
4740 

ttatggaatg ggacaacaac cagatccagc 
4800 

aatgatgtcg tcaagaatgg gtccctccca 
4860 

atccatctat cagtcctcag aaatgaaggg 
4920 

ctccttttcc cagcagcagt ttgcccacca 
4980 

catgaatggc agcagtggtc acatgggaca 
5040 

catgcctatg ggtcctgatc agaaatactg 
5100 

aaaccactgt acaaatgaca ctgcactagg 
5160 

tccatcttgg aagaaaggac cagctttgag 
5220 

ttgagcagga ctggatttta agccgaaggg 
5280 

tgctgtgtat catggtgttc aaaacagaaa 
5340 
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acaggggcct ccaatgcaag gaggctttca 

tatgatgaat cagatgaacc agcaaggcaa 
caacatcatg agaccccgga caaacacccc 
gctgcagggc cagcagtttt tgaatcagag 
ccctactgct ggtggtgctg cggtgatgag 
taatgctcaa atggtcgccc aacgcagcag 
gggggtggct atgatgatgc agcagcagca 
acagcaacag caacagcaac agcagcaaca 
taatgtgact gcttccccca gcatggatgg 
tcctccgcaa cagtttccat atcaaccaaa 
ctttggtcga gtgtctagtc ctcccaatgc 
gaatcccatg atgcaacacc cgcaggctgc 
ctggccatca ggaaatttgg ccaggaacag 
ggggaatcct gcagtgtata gtatggtgca 
gatgaacatg aaccccatgc ccatgtctgg 
ctgacatctc tgcaccagga cctcttaagg 
attattggga aggaatcatt gttccaggca 
ctccatcaag ggtattttaa gtgatgtcat 
caatatctac gtgtttttcc cccctccttc 
tgttttttgg cattccacct cctagggata 
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tiaattctgga gacatggagt 
5400 

tgctagccaa aatctcttaa 
5460 

gagattagaa catctggttt 
5520 

tcaggtgtag tagttctgtg 
5580 

ctttactaat ggtgttgagt 
5640 

aaggttttca tttgccattc 
5700 

tccaaatggc tttgcagaaa 
5760 

ttttcacatg ctaatgtgca 
5820 

attcttgagg tcttgaggga 
5880 

ttagacaaga actatgattt 
5940 

agagcaataa tgctttttaa 
6000 

gaatcagaat ctcgcagtgt 
6060 

attgtattat gtaaaatatg 
6120 

agagtttgtg aagctaaata 
6180 

ccagtggaag agacatccct 
6240 

tttccttccc caccccccag 
6300 

ctttttaaag agattatttg 
6360 

cagaactaag cactttgtta 
6420 

aaaaaatcag gaatttaaaa 
6480 

gtccgaaata atagcaattc 
6540 



gttacngatc ataaaacttt 
atacacgtag gtgggccaga 
ctctagttgc agtattggac 
ttgacccttt gtccagtgga 
tgctctgtcc ctattatttg 
atgtcctgta atacttcacc 
ggaaatgaga tgacagtatt 
gctgagtgca ctttatttaa 
atagtgaaac acatccctgg 
ttttttttaa agtactggtg 
aaataaactt ctgaaaaccc 
ttctgtgaat agattttttt 
tatatacctt tttttgtagg 
tttaacattg ttgatttcag 
tgacttttgt ggcctggggg 
ccttagatgc ctcgctcttt 
tttagatgta ggcattttaa 
atttgggggg aaagaataga 
aaaacgagca atttgaagag 
atgggctgtg tgtgtgtgtg 



tgtgtcactt ttttctgcct 
gaacattgga agaatcaaga 
aaagagcata gtcccagcct 
attggtgatt ctgaattgtc 
ccctaggctt tctcctaatg 
tccaggaact gtcatggatg 
taatcgcagc agtagcaaac 
aaagaatgga taaatgcaat 
tttttgccta cacttacgtg 
tcaccctttg cctatatggt 
aaggccaggt actgcattct 
gtaaatatga cctttaagat 
tcacaacaac tcatttttac 
taagctgtgt ggtgaggcta 
aggggtagtg caccacagct 
tcaatctctt aatctaaatg 
ttttttaaaa attcctctac 
tatggggaaa taaacttaaa 
aatcttttgg attttaagca 
tatgtgtgtg tgtgtgtgtg 
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tatgtttaat tatgttacct tttcatcccc 
6600 

agacctgaat cccgcggccg ccccgggcgt 
6660 

aactgtgcat cgtgcaccat ctcaatttct 
6720 

tgtaactata ctcctctaag tttcaatctt 
6780 

ttaaatgact agaattaatg cccatctttt 
6840 

ttacgagggc ttattcagaa gctttggact 
6900 

tcaaggttgt cggcttgtct accttgccag 
6960 

tcgttggtag atacgttgtt gacacttcta 
7020 

ttattattaa ataagttata aaaaaaataa 
7080 

ttttaaaacg aaaattcttg ttcttgagta 
7140 

aggtatagca tgaggtcgct cttattgacc 
7200 

taccctatga acatattcca ttttgtaatt 
7260 

taaagtttat gtacaaatat cataaaaaaa 
7320 

cttcttcggc gacagcatca ccgacttcgg 
7380 

ttctgatacc tgcatccaaa acctttttaa 
7440 

gcaagttcaa tgacaatttc aacatcattg 
7500 

ccttattctt tggcaaatct ggagcagaac 
7560 

tgttcttgtc tggcaaagag gccaaggacg 
7620 

taacggaggc ttcatcggag atgatatcac 
7680 

ttaggtgggt tgggttctta actaggatca 
7740 
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tttaggagcg ttttcagatt ttggttcgta 

agatactgaa aaaccccgca agttcacttc 
ttcatttata catcgttttg ccttctttta 
ggccatgtaa cctctgatct atagaatttt 
ttttggacct aaattcttca tgaaaatata 
tcttcgccag aggtttggtc aagtctccaa 
aaatttacga aaagatggaa aagggtcaaa 
aataagcgaa tttcttatga tttatgattt 
gtgtatacaa attttaaagt gactcttagg 
actctttcct gtaggtcagg ttgctttctc 
acacctctac cggcatgccc gaaattcccc 
tcgtgtcgtt tctattatga atttcattta 
gagaatcttt ttaagcaagg attttcttaa 
tggtactgtt ggaaccacct aaatcaccag 
ctgcatcttc aatggcctta ccttcttcag 
cagcagacaa gatagtggcg atagggtcaa 
cgtggcatgg ctcgtacaaa ccaaatgcgg 
cagatggcaa caaacccaag gaacctggga 
caaacatgtt gctggtgatt ataataccat 
tggcggcaga atcaatcaat tgatgttgaa 




ccttcaatgt aggaaattcg ttcttgatgg 
7800 

aagaggccaa aacattagct ttatccaagg 
7860 

gggccatgaa agcggccatt cttgtgattc 
7920 

tatcccaagc gacaccatca ccatcgtctt 
7980 

ctaattctct gacaacaacg aagtcagtac 
8040 

agtctaaaag agagtcggat gcaaagctac 
8100 

ctttacggat ttttagtaaa ccttgttcag 
8160 

cacccacagc acctaacaaa acggcatcaa 
8220 

gaagtgggac acctgtagca tcgatagcag 
8280 

acttgacatt ggaacgaaca tcagaaatag 
8340 

tttcttgacc aacgtggtca cctggcaaaa 
8400 

tggtatatcc ttgaaatata tatatatatt 
8460 

aagtaagacg attgctaacc acctattgga 
8520 

aacttcaagt attgtgatgc aagcatttag 
8580 

gccggttccg gcctctcacc tttccttttt 
8640 

gcgtcaggcg acctctgaaa ttaacaaaaa 
8700 

atagcgcccc tgtgtgttct cgttatgttg 
8760 

gaactcttgc atcttacgat acctgagtat 
8820 

gaggatcaat tcgtaatcat ggtcatagct 
8880 

aattccacac aacatacgag ccggaagcat 
8940 
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tttcctccac agtttttctc cataaticttg 

accaaatagg caatggtggc tcatgttgta 
tttgcacttc tggaacggtg tattgttcac 
cctttctctt accaaagtaa atacctccca 
ctttagcaaa ttgtggcttg attggagata 
atggtcttaa gttggcgtac aattgaagtt 
gtctaacact acctgtaccc catttaggac 
ccttcttgga ggcttccagc gcctcatctg 
caccaccaat taaatgattt tcgaaatcga 
ctttaagaac cttaatggct tcggctgtga 
cgacgatctt cttaggggca gacattagaa 
gctgaaatgt aaaaggtaag aaaagttaga 
aaaaacaata ggtccttaaa taatattgtc 
tcatgaacgc tcctctattc tatatgaaaa 
ctcccaattt ctcagttgaa aaaggtatat 
atttccagtc accgaatttg attctgtgcg 
aggaaaaaaa taatggttgc taagaga^tc 
tcccacagtt ggggatctcg actctagcta 
gcttcctgtg tgaaattgtt atccgctcac 
aaagtgtaaa gcctggggtg cctaatgagt 
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gaggtaactc acattaattg 
9000 

gtgccagctg gattaatgaa 
9060 

ctcttccgct tcctcgctca 
9120 

atcagctcac tcaaaggcgg 
9180 

gaacatgtga gcaaaaggcc 
9240 

gtttttccat aggctccgcc 
9300 

gtggcgaaac ccgacaggac 
9360 

gcgctctcct gttccgaccc 
9420 

aagcgtggcg ctttctcata 
9480 

ctccaagctg ggctgtgtgc 
9540 

taactatcgt cttgagtcca 
9600 

tggtaacagg attagcagag 
9660 

gcctaactac ggctacacta 
9720 

taccttcgga aaaagagttg 
9780 

tggttttttt gtttgcaagc 
9840 

tttgatcttt tctacggggt 
9900 

ggtcatgaga ttatcaaaaa 
9960 

taaatcaatc taaagtatat 
10020 

tgaggcacct atctcagcga 
10080 

cgtgtagata actacgatac 
10140 



cgttgcgctc actgcccgct 
tcggccaacg cgcggggaga 
ctgactcgct gcgctcggtc 
taatacggtt atccacagaa 
agcaaaaggc caggaaccgt 
cccctgacga gcatcacaaa 
tataaagata ccaggcgttt 
tgccgcttac cggatacctg 
gctcacgctg taggtatctc 
acgaaccccc cgttcagccc 
acccggtaag acacgactta 
cgaggtatgt aggcggtgct 
gaaggacagt atttggtatc 
gtagctcttg atccggcaaa 
agcagattac gcgcagaaaa 
ctgacgctca gtggaacgaa 
ggatcttcac ctagatcctt 
atgagtaaac ttggtctgac 
tctgtctatt tcgttcatcc 
gggagggctt accatctggc 
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ttccagtcgg gaaacctgtc 

ggcggtttgc gtattgggcg 
gttcggctgc ggcgagcggt 
tcaggggata acgcaggaaa 
aaaaaggccg cgttgctggc 
aatcgacgct caagtcagag 
ccccctggaa gctccctcgt 
tccgcctttc tcccttcggg 
agttcggtgt aggtcgttcg 
gaccgctgcg ccttatccgg 
tcgccactgg cagcagccac 
acagagttct tgaagtggtg 
tgcgctctgc tgaagccagt 
caaaccaccg ctggtagcgg 
aaaggatctc aagaagatcc 
aactcacgtt aagggatttt 
ttaaattaaa aatgaagttt 
agttaccaat gcttaatcag 
atagttgcct gactccccgt 
cccagtgctg caatgatacc 
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gcgagaccca cgctcaccgg ctccagattt 
10200 

cgagcgcaga agtggtcctg caactttatc 
10260 

ggaagctaga gtaagtagtt cgccagttaa 
10320 

aggcatcgtg gtgtcacgct cgtcgtttgg 
10380 

atcaaggcga gttacatgat cccccatgtt 
10440 

tccgatcgtt gtcagaagta agttggccgc 
10500 

gcataattct cttactgtca tgccatccgt 
10560 

aaccaagtca ttctgagaat agtgtatgcg 
10620 

acgggataat accgcgccac atagcagaac 
10680 

ttcggggcga aaactctcaa ggatcttacc 
10740 

tcgtgcaccc aactgatctt cagcatcttt 
10800 

aacaggaagg caaaatgccg caaaaaaggg 
10860 

catactcttc ctttttcaat attattgaag 
10920 

atacatattt gaatgtattt agaaaaataa 
10980 

aaaagtgcca cctgacgtct aagaaaccat 
11040 

gcgtatcacg aggccctttc gtctcgcgcg 
11100 

catgcagctc ccggagacgg tcacagcttg 
11160 

ccgtcagggc gcgtcagcgg gtgttggcgg 
11220 

agagcagatt gtactgagag tgcaccataa 
11280 

ttcttctcat gtatatatat atacaggcaa 
11340 



atcagcaata aaccagccag ccggaagggc 
cgcctccatc cagtctatta attgttgccg 
tagtttgcgc aacgttgttg ccattgctac 
tatggcttca ttcagctccg gttcccaacg 
gtgcaaaaaa gcggttagct ccttcggtcc 
agtgttatca ctcatggtta tggcagcact 
aagatgcttt tctgtgactg gtgagtactc 
gcgaccgagt tgctcttgcc cggcgtcaat 
tctaaaagtg ctcatcattg gaaaacgttc 
gctgttgaga tccagctcga tgtaacccac 
tactttcacc agcgtttctg ggtgagcaaa 
aataagggcg acacggaaat gttgaatact 
catttatcag ggttattgtc tcatgagcgg 
acaaataggg gttccgcgca catttccccg 
tatcatcatg acattaacct ataaaaatag 
tttcggtgat gacggtgaaa acctctgaca 
tctgtaagcg gatgccggga gcagacaagc 
gtgtcggggc tggcttaact atgcggcatc 
cgcatttaag cataaacacg cactatgccg 
cacgcagata taggtgcgac gtgaacagtg 
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agctgtatgt gcgcagctcg 
11400 

aagttcctat tccgaagttc 
11460 

ttgaaaacca aaagcgctct 
11520 

aacgagctac taaaatattg 
11580 

tgctatatat ctctgtgcta 
11640 

acttgcatct aaactcgacc 
11700 

caaaaaaatt gtagtaagaa 
11760 

catttcctat acgtagtata 
11820 

aatgaagaat catcaacgct 
11880 

atagaatata atcggggatg 
11940 

tcagtaaacg cgggaagtgg 
12000 

tagccttctt ctaaccttaa 
12060 

agagcgcaca aaggagaaaa 
12120 

gggatgcatt tttgtagaac 
12180 

tcgcgttgca tttctgttct 
12240 

ctctcgcgtt gcatttttgt 
12300 

cgctttcgcg ttgcatttct 
12360 

tagcgctctc gcgttgcatt 
12414 



cgttgcattt tcggaagcgc 
ctattctcta gctagaaagt 
gaagacgcac tttcaaaaaa 
cgaataccgc ttccacaaac 
tatccctata taacctaccc 
tctacatttt ttacgtttat 
ctattcatag agtgaatcga 
tagagacaaa atagaagaaa 
atcactttct gttcacaaag 
cctttatctt gaaaaaatgc 
agtcaggctt tttttatgga 
cggacctaca gtgcaaaaag 
aaagtaatct aagatgcttt 
aaaaaagaag tatagattct 
gtaaaaatgc agctcagatt 
tttacaaaaa tgaagcacag 
gttctgtaaa aatgcagctc 
tttgttctac aaaatgaagc 




tcgttttcgg aaacgctttg 

ataggaactt cagagcgctt 
ccaaaaacgc accggactgt 
attgctcaaa agtatctctt 
atccaccttt cgctccttga 
ctctagtatt actctttaga 
aaacaatacg aaaatgtaaa 
ccgttcataa ttttctgacc 
tatgcgcaat ccacatcggt 
acccgcagct tcgctagtaa 
agagaaaata gacaccaaag 
ttatcaagag actgcattat 
gttagaaaaa tagcgctctc 
ttgttggtaa aatagcgctc 
ctttgtttga aaaattagcg 
attcttcgtt ggtaaaatag 
agattctttg tttgaaaaat 
acagatgctt cgtt 



<210> 18 

<211> 7254 

<212> DNA 

<213> artificial sequence 
<220> 
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<223> pGBDRXRalpha 
<400> 18 

gcttgcatgc aacttctttt cttttttttt cttttctctc tcccccgttg ttgtctcacc 
60 



atatccgcaa cgacaaaaaa 
120 

cagcaccaac agatgtcgtt 
180 

tttttccttc cttcactcac 
240 

tccagttact tgaatttgaa 
300 

cctgcaatta ttaatctttt 
360 

ctttttctgc acaatatttc 
420 

agcctcctga aagatgaagc 
480 

taaaaagctc aagtgctcca 
540 

ggagtgtcgc tactctccca 
600 

agtggaatca aggctagaaa 
660 

ccttgacatg attttgaaaa 
720 

atttgtacaa gataatgtga 
780 

tgatatgcct ctaacattga 
840 

tagtaacaaa ggtcaaagac 
900 

tcccatcagc accctgagct 
960 

ctcccccatg ggcccccact 
1020 

tggcagcccc cagctcagct 
1080 

ccccctgggc ctcaatggcg 

1140 



aatgatggaa gacactaaag 
gttccagagc tgatgagggg 
gcacactact ctctaatgag 
ataaaaaaag tttgccgctt 
gtttcctcgt cattgttctc 
aagctatacc aagcatacaa 
tactgtcttc tatcgaacaa 
aagaaaaacc gaagtgcgcc 
aaaccaaaag gtctccgctg 
gactggaaca gctatttcta 
tggattcttt acaggatata 
ataaagatgc cgtcacagat 
gacagcatag aataagtgcg 
agttgactgt atcgccggaa 
cccccatcaa cggcatgggc 
ccatgtcggt gcccaccaca 
cacctatgaa ccccgtcagc 
tcctcaaggt ccccgcccac 



gaaaaaatta acgacaaaga 
tatcttcgaa cacacgaaac 
caacggcata cggccttcct 
tgctatcaag tataaataga 
gttccctttc ttccttgttt 
tcaactccaa gcttgaagca 
gcatgcgata tttgccgact 
aagtgtctga agaacaactg 
actagggcac atctgacaga 
ctgatttttc ctcgagaaga 
aaagcattgt taacaggatt 
agattggctt cagtggagac 
acatcatcat cggaagagag 
ttcccgggac agctgcattc 
ccgcctttct cggtcatcag 
cccaccctgg gcttcagcac 
agcagcgagg acatcaagcc 
ccctcaggaa acatggcttc 
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cttcaccaag cacatctgcg 
1200 

gtacagctgc gaggggtgca 
1260 

cacctgccgc gacaacaagg 
1320 

ctgccgctac cagaagtgcc 
1380 

gcagcgtggc aaggaccgga 
1440 

catgccggtg gagaggatcc 
1500 

cgtggaggca aacatggggc 
1560 

ccaagcagcc gacaaacagc 
1620 

ctcagagctg cccctggacg 
1680 

catcgcctcc ttctcccacc 
1740 

gctgcacgtc caccggaaca 
1800 

gctgacggag cttgtgtcca 
1860 

cctgcgcgcc atcgtcctct 
1920 

ggaggcgctg agggagaagg 
1980 

agagcagccg ggaaggttcg 
2040 

gctcaaatgc ctggaacatc 
2100 

cttccttatg gagatgctgg 
2160 

ttgtgcccac ccgttctggc 
2220 

ctgtccctgc ccttctctgc 
2280 

tgcctaagag atgtgttgtc 
2340 



ccatctgcgg ggaccgctcc 
agggcttctt caagcggacg 
actgcctgat tgacaagcgg 
tggccatggg catgaagcgg 
acgagaatga ggtggagtcg 
tggaggctga gctggccgtg 
tgaaccccag ctcgccgaac 
ttttcaccct ggtggagtgg 
accaggtcat cctgctgcgg 
gctccatcgc cgtgaaggac 
gcgcccacag cgcaggggtg 
agacgcggga catgcagatg 
ttaaccctga ctccaagggg 
tctatgcgtc cttggaggcc 
ctaagctctt gctccgcctg 
tcttcttctt caagctcatc 
aggcgccgca ccaaatgact 
caccctgcct ggacgccagc 
ctggcctgtt tggactttgg 
accctcctta tttctgttac 
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tcaggcaagc actatggagt 

gtgcgcaagg acctgaccta 
cagcggaacc ggtgccagta 
gaagccgtgc aggaggagcg 
accagcagcg ccaacgagga 
gagcccaaga ccgagaccta 
gaccctgtca ccaacatttg 
gccaagcgga tcccacactt 
gcaggctgga atgagctgct 
gggatcctcc tggccaccgg 
ggcgccatct ttgacagggt 
gacaagacgg agctgggctg 
ctctcgaacc cggccgaggt 
tactgcaagc acaagtaccc 
ccggctctgc gctccatcgg 
ggggacacac ccattgacac 
taggcctgcg ggcccatcct 
tgttcttctc agcctgagcc 
ggcacagcct gtcactgctc 
tacttgtctg tggcccaggg 
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cagtggcttt cctgagcagc agccttcgtg 
2400 

ctccccaccg ggctctcagg acgccctgcc 
2460 

cttcggcccc agccctggag ctgcagccaa 
2520 

tgatttttat tattaaataa gttataaaaa 
2580 

cttaggtttt aaaacgaaaa ttcttgttct 
2640 

tttctcaggt atagcatgag gtcgctctta 
2700 

gtgcacaaac aatacttaaa taaatactac 
2760 

acgaaatttg ctattttgtt agagtctttt 
2820 

tcaacaccaa taacgccatt taatctaagc 
2880 

ccagctaaca taaaatgtaa gctttcgggg 
2940 

gagttccaat ccaaaagttc acctgtccca 
3000 

gaatgaggtt tctgtgaagc tgcactgagt 
3060 

cttttaataa ctggccttac tccaaagaca 
3120 

aaacctttat gctcagaaaa ctactgaccg 
3180 

gactcatctc catgcagttg gacgatatca 
3240 

tcctccttag gttgattacg aaacacgcca 
3300 

ttatatgctt ttacaagact tgaaattttc 
3360 

ctattgggca cacatataat acccagcaag 
3420 

gcctctgtgc tctgcaagcc gcaaactttc 
3480 

ataacagaca tactccaagc tgcctttgtg 
3540 
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gcaagaacta gcgtgagccc agccaggcgc 

acacccacgg ggcttgggcg actacagggt 

gctaattccg ggcgaatttc ttatgattta 

aaataagtgt atacaaattt taaagtgact 

tgagtaactc tttcctgtag gtcaggttgc 

ttgaccacac ctctaccggc atgccggcaa 

tcagtaataa cctatttctt agcatttttg 

acaccatttg tctccacacc tccgcttaca 

gcatcaccaa cattttctgg cgtcagtcca 

ctctcttgcc ttccaaccca gtcagaaatc 

cctgcttctg aatcaaacaa gggaataaac 

agtatgttgc agtcttttgg aaatacgagt 

cttcgacgtg actcatcata caacgtcaga 

aaaccgagga actcttggta ttcttgccac 

atgccgtaat cattgaccag agccaaaaca 

accaagtatt tcggagtgcc tgaactattt 

cttgcaataa ccgggtcaat tgttctcttt 

tcagcatcgg aatctagagc acattctgcg 

accaatggac cagaactacc tgtgaaatta 

tgcttaatca cgtatactca cgtgctcaat 
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agtcaccaat gccctccctc ttggccctct 
3600 

aatcatggtc atagctgttt cctgtgtgaa 
3660 

tacgagccgg aagcataaag tgtaaagcct 
3720 

taattgcgtt gcgctcactg cccgctttcc 
3780 

aatgaaccgg ccaacgcgcg gggagaggcg 
3840 

cgctcactga ctcgctgcgc tcggtcgttc 
3900 

aggcggtaat acggttatcc acagaatcag 
3960 

aaggccagca aaaggccagg aaccgtaaaa 
4020 

tccgcccccc tgacgagcat cacaaaaatc 
4080 

caggactata aagataccag gcgtttcccc 
4140 

cgaccctgcc gcttaccgga tacctgtccg 
4200 

ctcatagctc acgctgtagg tatctcagtt 
4260 

gtgtgcacga accccccgtt cagcccgacc 
4320 

agtccaaccc ggtaagacac gacttatcgc 
4380 

gcagagcgag gtatgtaggc ggtgctacag 
4440 

acactagaag gacagtattt ggtatctgcg 
4500 

gagttggtag ctcttgatcc ggcaaacaaa 
4560 

gcaagcagca gactacgcgc agaaaaaaag 
4620 

cggggtctga cgctcagtgg aacgaaaact 
4680 

caaaaaggat cttcacctag atccttttaa 
4740 
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ccttttcttt tttcgaccga attaattcgt 

attgttatcc gctcacaatt ccacacaaca 
ggggtgccta atgagtgagg taactcacat 
agtcgggaaa cctgtcgtgc cagctggatt 
gtttgcgtat tgggcgctct tccgcttcct 
ggctgcggcg agcggtatca gctcactcaa 
gggataacgc aggaaagaac atgtgagcaa 
aggccgcgtt gctggcgttt ttccataggc 
gacgctcaag tcagaggtgg cgaaacccga 
ctggaagctc cctcgtgcgc tctcctgttc 
cctttctccc ttcgggaagc gtggcgcttt 
cggtgtaggt cgttcgctcc aagctgggct 
gctgcgcctt atccggtaac tatcgtcttg 
cactggcagc agccactggt aacaggatta 
agttcttgaa gtggtggcct aactacggct 
ctctgctgaa gccagttacc ttcggaaaaa 
ccaccgctgg tagcggtggt ttttttgttt 
gatctcaaga agatcctttg atcttttcta 
cacgttaagg gattttggtc atgagattat 
attaaaaatg aagttttaaa tcaatctaaa 
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gtatatacga gtaaactitgg tctgacagtt 
4800 

cagcgatctg tctatttcgt tcatccatag 
4860 

cgatacggga gggcttacca tctggcccca 
4920 

caccggctcc agatttatca gcaataaacc 
4980 

gtcctgcaac tttatccgcc tccatccagt 
5040 

gtagttcgcc agttaatagt ttgcgcaacg 
5100 



cacgctcgtc gtttggtatg gcttcattca 
5160 



catgatcccc catgttgtgc aaaaaagcgg 
5220 



gaagtaagtt ggccgcagtg ttatcactca 
5280 



ctgtcatgcc atccgtaaga tgcttttctg 
5340 



gagaatagtg tatgcggcga ccgagttgct 
5400 



cgccacatag cagaacttta aaagtgctca 
5460 

tctcaaggat cttaccgctg ttgagatcca 
5520 



gatcttcagc atcttttact ttcaccagcg 
5580 

atgccgcaaa aaagggaata agggcgacac 
5640 

ttcaatatta ttgaagcatt tatcagggtt 
5700 

gtatttagaa aaataaacaa ataggggttc 
5760 

acgtctaaga aaccattatt atcatgacat 
5820 



cctttcgtct cgcgcgtttc ggtgatgacg 
5880 



agacggtcac agcttgtctg taagcggacg 
5940 
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accaatgctt aaccagtgag gcacctatct 

ttgcctgact ccccgtcgtg tagataacta 

gtgctgcaat gataccgcga gacccacgct 

agccagccgg aagggccgag cgcagaagtg 

ctattaattg tcgccgggaa gctagagtaa 

ttgttgccat tgctacaggc atcgtggtgt 

gctccggttc ccaacgatca aggcgagtta 

ttagctcctt cggtcctccg atcgttgtca 

tggttatggc agcactgcat aattctctta 

tgactggtga gtactcaacc aagtcattct 

cttgcccggc gtcaatacgg gataataccg 

tcattggaaa acgttcttcg gggcgaaaac 

gttcgatgta acccactcgt gcacccaact 

tttctgggtg agcaaaaaca ggaaggcaaa 

ggaaatgttg aatactcata ctcttccttt 

attgtctcat gagcggatac atatttgaat 

cgcgcacatt tccccgaaaa gtgccacctg 

taacctataa aaataggcgt atcacgaggc 

gtgaaaacct ctgacacatg cagctcccgg 

ccgggagcag acaagcccgt cagggcgcgt 
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cagcgggtgt tggcgggtgt 
6000 

tgagagtgca ccataacgca 
6060 

atatatatac aggcaacacg 
6120 

agctcgcgtt gcattttcgg 
6180 

aagttcctat tctctagcta 
6240 

cgctctgaag acgcactttc 
6300 

aaacttttgg ttttcgcgag 
6360 

attgctcgat gatttatatt 
6420 

ttgctatata tctctgtgct 
6480 

aacttgcatc taaactcgac 
6540 

acaaaaaaat tgtagtaaga 
6600 

acatttccta tacgtagtat 
6660 

caatgaagaa tcatcaacgc 
6720 

tatagaatat aatcggggat 
6780 

atcagtaaac gcgggaagtg 
6840 

gtagccttct tctaacctta 
6900 

tagagcgcac aaaggagaaa 
6960 

cgggatgcat ttttgtagaa 
7020 

ctcgcgttgc atttctgttc 
7080 

gctctcgcgt tgcatttttg 
7140 



cggggctggc ttaactatgc 
tttaagcata aacacgcact 
cagatatagg tgcgacgtga 
aagcgctcgt tttcggaaac 
gaaagtatag gaacttcaga 
aaaaaaccaa aaacgcaccg 
acttctgcgt gaaagttttt 
gcgaataccg cttccacaaa 
atatccctat ataacctacc 
ctctacattt tttatgttta 
actattcata gagtgaatcg 
atagagacaa aatagaagaa 
tatcactttc tgttcacaaa 
gcctttatct tgaaaaaatg 
gagtcaggct ttttttatgg 
acggacctac agtgcaaaaa 
aaaagtaatc taagatgctt 
caaaaaagaa gtatagattc 
tgtaaaaacg cagctcagat 
ttttacaaaa atgaagcaca 



ggcatcagag cagattgtac 
atgccgttct tctcatgtat 
acagtgagct gtatgtgcgc 
gctttgaagt tcctattccg 
gcgcttttga aaaccaaaag 
gactgtaacg agctactaaa 
tggtttttgc gtggcctgac 
cattgctcaa aagtatctct 
catccacctt tcgctccttg 
tctctagtat tactctttag 
aaaacaatac gaaaatgtaa 
accgttcata attttctgac 
gtatgcgcaa tccacatcgg 
cacccgcagc ttcgctagta 
aagagaaaat agacaccaaa 
gttatcaaga gactgcatta 
tgttagaaaa atagcgctct 
tttgttggta aaatagcgct 
tctttgtttg aaaaattagc 
gattcttcgt tggtaaaata 
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gcgctctcgc gttgcatttc tgttctgtaa aaatgcagct cagattcttt gtttgaaaaa 
7200 

ttagcgctct cgcgttgcat ttttgttcta caaaatgaag cacagatgct tcgt 
7254 
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